Decide on an architectural style

Post by jrineakter »

1. Build an analytics backlog
Create a list of metrics that the business needs or wants. It’s usually best to phrase the metrics as questions, such as “What is the daily average session length of visitors to our website?” or “What is our average order value for a certain time period?” This is the equivalent of user stories in software development. By starting with high-value questions, you’ll see patterns emerge that can help with your next step.
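
As a rough illustration of the idea, a backlog can be kept as a list of questions ranked by value, and each question eventually maps to a metric you can compute. The sketch below is only an assumption about how that might look: the `AnalyticsStory` class, the `business_value` score, the `average_order_value` helper, and the sample orders are all hypothetical and not from the original article.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class AnalyticsStory:
    question: str        # the metric, phrased as a question
    business_value: int  # hypothetical 1-5 score agreed with stakeholders


backlog = [
    AnalyticsStory("What is the daily average session length of visitors to our website?", 3),
    AnalyticsStory("What is our average order value for a certain time period?", 5),
]

# Work on the highest-value questions first.
prioritized = sorted(backlog, key=lambda story: story.business_value, reverse=True)


def average_order_value(orders, start, end):
    """Mean order total for orders placed between start and end, inclusive.

    `orders` is a list of (placed_on, total) tuples. Returns None when no
    order falls inside the period.
    """
    totals = [total for placed_on, total in orders if start <= placed_on <= end]
    return sum(totals) / len(totals) if totals else None


# Made-up sample data, for illustration only.
orders = [
    (date(2025, 1, 3), 40.0),
    (date(2025, 1, 9), 60.0),
    (date(2025, 2, 1), 500.0),
]
january_aov = average_order_value(orders, date(2025, 1, 1), date(2025, 1, 31))
```

Phrasing the metric as a question forces you to pin down the inputs (here, which orders and which period) before any pipeline is built.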

2. Decide on an architectural style
Like a well-designed application or piece of software, the data in your data warehouse or data lake should conform to an architectural style, such as a star schema, a snowflake schema, a data vault, or one of many denormalized formats. Select the style based on the kinds of questions in your analytics backlog and the shape and types of data you predominantly have available in your enterprise. Think in layers of data models, from raw data to clean data to transformed analytic models. You can compare this layering to layering software from raw API to business logic to UX.
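
The raw, clean, and analytic layers can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a recommended pipeline: the `RAW_SESSIONS` records, the field names, and both function names are invented for the example.

```python
from collections import defaultdict
from datetime import datetime

# Layer 1: raw data, kept exactly as it arrived (hypothetical session log).
RAW_SESSIONS = [
    {"visitor": "a", "start": "2025-01-02T10:00:00", "seconds": "120"},
    {"visitor": "b", "start": "2025-01-02T11:30:00", "seconds": "240"},
    {"visitor": "b", "start": "2025-01-02T11:30:00", "seconds": "240"},  # duplicate
    {"visitor": "c", "start": "2025-01-03T09:15:00", "seconds": "n/a"},  # bad value
    {"visitor": "d", "start": "2025-01-03T14:00:00", "seconds": "300"},
]


def clean(raw_rows):
    """Layer 2: typed, de-duplicated rows; rows that fail to parse are dropped."""
    seen, cleaned = set(), []
    for row in raw_rows:
        try:
            record = (
                row["visitor"],
                datetime.fromisoformat(row["start"]),
                int(row["seconds"]),
            )
        except (KeyError, ValueError):
            continue  # skip rows with missing or malformed fields
        if record not in seen:
            seen.add(record)
            cleaned.append(record)
    return cleaned


def daily_average_session_length(clean_rows):
    """Layer 3: an analytic model that answers one backlog question."""
    by_day = defaultdict(list)
    for _visitor, started_at, seconds in clean_rows:
        by_day[started_at.date().isoformat()].append(seconds)
    return {day: sum(values) / len(values) for day, values in by_day.items()}


model = daily_average_session_length(clean(RAW_SESSIONS))
# model == {"2025-01-02": 180.0, "2025-01-03": 300.0}
```

Keeping the raw layer untouched means the cleaning rules can change later without re-ingesting the source data.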

The architectural style you choose will have a big impact on how analysts and data scientists access and use the data. Applying this style consistently will make your data platform much more usable for all data consumers. At data.world, we use a star schema layout and an ELT (extract, load, transform) architectural pattern. The star schema layout of fact and dimension tables works particularly well for tracking the activity of our membership base and for pivoting our analytics by time period or by customer org.
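
To make the fact and dimension idea concrete, here is a small star schema built in an in-memory SQLite database from Python. It is a sketch only and does not reflect data.world's actual schema: the table names (`dim_date`, `dim_org`, `fact_member_activity`), the columns, and the sample rows are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, day TEXT, month TEXT);
CREATE TABLE dim_org  (org_key  INTEGER PRIMARY KEY, org_name TEXT);
-- One fact row per (day, org) with a count of member events.
CREATE TABLE fact_member_activity (
    date_key INTEGER REFERENCES dim_date(date_key),
    org_key  INTEGER REFERENCES dim_org(org_key),
    events   INTEGER
);
INSERT INTO dim_date VALUES (1, '2025-01-15', '2025-01'), (2, '2025-02-03', '2025-02');
INSERT INTO dim_org  VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO fact_member_activity VALUES (1, 1, 10), (1, 2, 4), (2, 1, 7), (2, 2, 9);
""")


def activity_by(dimension):
    """Total events grouped by one dimension attribute ('month' or 'org_name')."""
    # Whitelist the column so the f-string never interpolates arbitrary input.
    column = {"month": "d.month", "org_name": "o.org_name"}[dimension]
    query = f"""
        SELECT {column}, SUM(f.events)
        FROM fact_member_activity AS f
        JOIN dim_date AS d ON d.date_key = f.date_key
        JOIN dim_org  AS o ON o.org_key  = f.org_key
        GROUP BY {column}
        ORDER BY {column}
    """
    return conn.execute(query).fetchall()


by_month = activity_by("month")    # [('2025-01', 14), ('2025-02', 16)]
by_org = activity_by("org_name")   # [('Acme', 17), ('Globex', 13)]
```

The same fact table answers both questions; only the dimension used for grouping changes, which is what makes pivoting by time period or customer org cheap in this layout.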

3. Select a toolchain
Once you have an architectural style and a backlog of analytics stories, it’s time to choose some tools. How well these tools work together is critical to maintaining agility in a world with ever-expanding data science and analytics use cases. Different data platforms support different architectural styles. The linchpins of the toolchain are your data platform/query layer, your ETL/data-integration tooling, and your data catalog.

Data quality, profiling, lineage, and other tools can be integrated as your usage matures. Having a data catalog with an open and flexible metadata model is critical to adding new tools over time. It also gives you the basis to expand your BI, ML/AI, and data science toolbox as the needs of your data consumers grow. At data.world, we’ve adopted Jira to manage our analytics backlog, Snowflake for our data platform, dbt for transforms, and a variety of analytics tools. All of this is coordinated via a data.world data catalog.

4. Gather your team
Now it’s time to bring together the data consumers and producers who will be working on the initial analytics stories. Good agile processes incorporate a diversity of stakeholders at every touchpoint. This keeps feedback loops tight and might be the single most important thing that drives adoption. Consider who will coordinate your data sprints as well, and anoint someone to play the role of data product manager or owner at this point too. Data engineers, stewards, and product managers cannot go into a cave for months only to emerge and expect analysts and data scientists to start using the results.