Internal teams needed a way to onboard their data sets to the data warehouse and schedule them for automatic ingestion. Teams then pull these data sets into a reporting application where they build dashboards and reports to share. The existing process required tickets to other teams and weeks to complete. My task was to design an application that collects information about the data set, ensures it's formatted correctly, sets up space and permissions on the server to accept the data upload, and optionally schedules it to auto-ingest. I first had to learn about data sets and transformation, and then I was able to create task and data flows to show how things should move through the system. I then interviewed users to understand their needs, ideated onboarding screens, and gathered feedback. Once I finalized the designs, the UI developer took over. The result was an application that reduced a 2-3 week manual process down to a 7-10 minute task users could do themselves. We made a lot of people happy and I learned a lot about data sets and formats.
My role: UX research and design, project management, light visual design, user guide content creation
GETTING STARTED - TASK & SYSTEM FLOWS
While the engineers were pretty clear on what needed to happen, I needed some help understanding this complex project and the basics of data storage and transfer so I could better understand what I needed to build. My lead engineer was fantastic at explaining the details (he used the "squares to triangles" analogy below), and from our conversations I was able to construct flows that helped me understand how the data would move through our systems and what would happen at each step.

I discovered there would be three distinct phases to this process. The first begins with the user uploading a JSON file (the "square"). Our system needs to understand the data in the JSON and transform it into data categories in the data warehouse (the "triangle"). It outputs this data in a table that the user can then verify. Once the user thinks everything is in order, they move to step two, where they upload a test file that our system compares against the table from step one to confirm it can output the data into the correct structure (we're sure the "square" was successfully converted to a "triangle"). If step two is successful, the user can move to step three: scheduling their data streams for automatic ingestion.
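To make the "squares to triangles" idea concrete, here's a minimal sketch of phases one and two. This is purely illustrative (the engineers built the real thing); the type names, record shapes, and helper functions are my own assumptions, not the actual warehouse implementation.

```python
import json

# Illustrative mapping from JSON value types to hypothetical
# warehouse column types ("square" -> "triangle").
TYPE_MAP = {str: "STRING", int: "INTEGER", float: "FLOAT", bool: "BOOLEAN"}

def infer_schema(record: dict) -> dict:
    """Phase one sketch: map each JSON field to a warehouse column type."""
    return {field: TYPE_MAP.get(type(value), "STRING")
            for field, value in record.items()}

def validate_upload(record: dict, schema: dict) -> list:
    """Phase two sketch: check a test record against the saved schema."""
    errors = []
    for field, expected in schema.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif TYPE_MAP.get(type(record[field]), "STRING") != expected:
            errors.append(f"type mismatch on {field}: expected {expected}")
    return errors

sample = json.loads('{"team": "analytics", "rows": 1200, "active": true}')
schema = infer_schema(sample)
print(schema)  # {'team': 'STRING', 'rows': 'INTEGER', 'active': 'BOOLEAN'}
```

In the real flow, the table produced by phase one is what the user verifies on screen before moving on; phase two only succeeds when the test upload produces no errors.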
THE FIRST FEW VERSIONS
The first version for me is just getting ideas out of my head into some kind of structure. I don't worry about it too much because I know it will bring up new questions. I like to explore different ideas, even if at the outset I don't think one will work, because it forces me to look at things from another angle. It's sometimes just as important to sketch the wrong thing as the right one so I can validate my thinking. There were many steps in the onboarding journey, so I tried several different ways of walking the user through each part.
At first I thought the user would need and/or want to complete all three flows at once. When showing these to potential users, including my own teammates, I discovered they wanted to keep the three flows distinctly separate from each other. They may want to just complete the first flow and make sure everything works, or make updates to their initial data table if there were problems. The second step involves actually creating space on the server for the default table template, so users would only want to do this once they were sure they were ready to deploy their data and actually upload a data set. The third flow, automatic ingestion, might not even be turned on by some teams because they would want to always manually upload their data. That feedback led me to abandon the progress indicators across the top altogether. However, I did keep the numbered steps down the left to help keep users working through all the required steps to complete each flow and be able to save.
AUTOMATE ACTIONS FOR THE USER WHERE POSSIBLE
After I had the rough interactions figured out and prototyped, my focus changed to some of the smaller details we'd put off. One of those details was retention and quotas: how much space on the server each data feed would be allowed and how long we'd retain the data. This is important because cloud server space was at a premium and each team gets billed for how much they use. I wanted to inform teams of how much space they were using, and also make sure they wouldn't run out without warning and cause data reporting errors. Our lead engineer suggested a notification system warning teams when they were approaching their limit and asking them to request more. I asked if we would ever deny allocating them more space, and the response from our higher-ups was no. I instead suggested we just automatically bump up their allotment for them, as notifications are easily ignored or forgotten. Why make the user take an action when the system can do it for them? The team agreed. I kept the Quota Usage column so teams could stay informed of how much space they were using and trim their usage bills when needed.
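The auto-bump behavior we settled on can be sketched in a few lines. The 80% threshold, the growth factor, and the record shape here are illustrative assumptions on my part, not the production values:

```python
# Sketch of the auto-bump behavior: instead of only warning a team
# near its quota, the system raises the allotment automatically.
WARN_THRESHOLD = 0.8   # act once usage passes 80% of quota (assumed value)
GROWTH_FACTOR = 1.5    # assumed growth factor

def check_quota(feed: dict) -> dict:
    """Bump the quota automatically when a feed nears its limit."""
    usage_ratio = feed["used_gb"] / feed["quota_gb"]
    if usage_ratio >= WARN_THRESHOLD:
        feed["quota_gb"] = round(feed["quota_gb"] * GROWTH_FACTOR)
        feed["bumped"] = True   # surfaced in the Quota Usage column
    else:
        feed["bumped"] = False
    return feed

feed = {"team": "analytics", "used_gb": 85, "quota_gb": 100}
print(check_quota(feed))
# {'team': 'analytics', 'used_gb': 85, 'quota_gb': 150, 'bumped': True}
```

The design point is the same either way: the system takes the action, and the Quota Usage column keeps teams informed of what happened and why their bill changed.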
HERE COMES THE COMPLIANCE TEAM...
About this time the compliance team heard about our project and requested a meeting. As I was also handling PM duties, I took responsibility for gathering their requirements. Their concern was around what types of data might be sent to the data warehouse and who would be able to access it. Data such as email addresses are considered personally identifiable information (PII), so we needed to start thinking about how to recognize such data and (ideally) automatically redact it. Additionally, we needed to build a way for teams to restrict their data views to certain teams or individuals. Since this option would require less engineering effort to implement than redaction, I started there. This functionality had to be included before we could release our application to wider audiences, so I identified a temporary location in the first flow where we could gather and store these permissions. I knew that down the line, more permissions functionality would be required and we'd need to build some type of admin area.
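The two compliance measures can be illustrated side by side. This is a hypothetical sketch only; the email pattern, function names, and allow-list shape are my assumptions, and real PII detection covers far more than email addresses:

```python
import re

# Illustrative email pattern for the PII redaction idea (assumed,
# not the production detector).
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def redact_pii(value: str) -> str:
    """Replace anything that looks like an email address."""
    return EMAIL_RE.sub("[REDACTED]", value)

def can_view(view_permissions: set, requester: str) -> bool:
    """Allow access only to teams/individuals on the view's list."""
    return requester in view_permissions

print(redact_pii("contact jane.doe@example.com for access"))
# contact [REDACTED] for access
```

The permission check is the trivially simpler of the two, which is why it shipped first: it's a lookup against a stored list, while redaction requires reliably recognizing PII in arbitrary data.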
We didn't quite meet our very short deadline due to some authorization issues the engineers ran into on the data warehouse side (auth is widely known to throw monkey wrenches around), but we still got this out in a very short timeline: about 3 months. There is still a lot to do, and I'd have loved more time to focus on the visual design, but this is a big win for our team. What was previously a 2-3 week manual process requiring Jira tickets between teams has been reduced to an approximately 7-10 minute task teams can do themselves. We received many accolades from our higher-ups for meeting this goal, and adoption is growing as our initial beta testers spread the word throughout the company about its availability.
Here are the final screens for the v1 release.