Data
Onboarding

A data management tool that enables non-technical users to easily extract data from external systems, apply the desired transformations, and load the results into the [m]Platform database. This feature will be launched in April 2020.

[m]Platform users work with advertising clients, and they constantly need to bring multiple clients' data sources together for audience analysis and segmentation. The process is currently very manual and highly dependent on tech support. This design aims to make it easy for non-technical users to ingest data and to centralize data storage, so teams can keep growing their understanding of potential consumers.


"Time to value is the issue of ETL. It takes too long from getting data source to being able to use the data. A shortcut is greatly needed to keep our clients happy."

- User from Xaxis

PROJECT DURATION

Apr 2019 - Nov 2019


PROJECT TYPE

New product design


PRODUCT TAG

Ad tech
SaaS web application
Data management

My Role


I was the design lead on this project. Because this was a new product, I was responsible for end-to-end product design from concept to implementation. I worked with the architects, the product owner, and the engineering lead to define functionality, and came up with user flows by brainstorming with PMs. I created wireframes to socialize within the team and collect feedback, and built a hi-fi prototype for user testing. I also planned user research and conducted user interviews with agency users.

Design played a key role in the making of the product, since the requirements were not fully thought through on day one. Visualizing the user experience was a way for the entire team to approach the ambiguity and break it into actionable chunks.

I started from the technical requirements: ETL. What is it?


The project started with a clear goal: build an ETL (Extract, Transform, Load) tool for marketers and advertisers to import data into [m]Platform for data visualization and segmentation. I built the user flow based on this technical process; a rough sketch of the pipeline follows the three steps below.

Extract the data from the source system; this requires authentication and security settings.

Transform (redefine) the raw data based on new business requirements or the target data structure.

Load the data into a data model that is well defined for a specific business goal, such as building audiences.
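To make the three steps concrete, here is a minimal, hypothetical sketch of such a pipeline in Python. The source endpoint, token, field names, and the SQLite stand-in for the platform's data store are illustrative assumptions, not the actual [m]Platform implementation.

```python
# A minimal ETL sketch (hypothetical endpoints and schema, not the real [m]Platform API).
import requests
import sqlite3

# Extract: pull raw records from the source system, authenticated with an API token.
resp = requests.get(
    "https://api.example-crm.com/v1/customers",        # hypothetical source system
    headers={"Authorization": "Bearer <token>"},        # auth/security settings
    timeout=30,
)
raw_records = resp.json()

# Transform: redefine the raw fields to fit the target structure and business rules.
transformed = [
    {
        "customer_id": r["id"],
        "email": r["email"].lower(),                    # normalize for matching
        "region": r.get("country_code", "UNKNOWN"),     # fill gaps with a default
    }
    for r in raw_records
]

# Load: write the rows into a table defined for a specific business goal (e.g. audience building).
conn = sqlite3.connect("platform.db")                   # stand-in for the platform's data store
conn.execute(
    "CREATE TABLE IF NOT EXISTS audience_profiles (customer_id TEXT, email TEXT, region TEXT)"
)
conn.executemany(
    "INSERT INTO audience_profiles VALUES (:customer_id, :email, :region)",
    transformed,
)
conn.commit()
conn.close()
```

In the product, each of these steps corresponds to something the user configures in the UI rather than code they write; the sketch is only meant to show what the tool has to do on their behalf.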

Finish E, T, L tasks in ONE screen: the "Hub-n-spoke" approach

Initial Proposal

My initial proposal was based on the ETL model: extract source data on the left side, select the target data object on the right side, and define transformations in the middle while connecting the left to the right. With this interactive approach, all of the tasks can be done in one screen, and it is intuitive for users to follow the process.

I conducted usability testing to validate the concept and got quite positive feedback. From a usability perspective, this was a successful proposal with a good look and feel: the UI layout echoes the technical process, so users can easily follow the concept of setting up data in the data model. Unexpectedly, I got pushback from the backend team.

DESIGN CHALLENGE 1

The technology architecture changed, and so would the mental model.


Since this was a new product, the UX design and the technical architecture design were happening almost in parallel. When we planned to deliver the design to the UI engineering team for implementation, I was notified that the technology architecture had changed direction, which could significantly impact the UX. Shifting the mental model toward the new architecture was a big challenge for me.

Rethink the flow. But again, does the technology architecture change really matter to the user experience?


I hosted a whiteboarding session with the PM and the architect to sketch out what was happening under the hood. We talked through all of the changes to prioritize the ones that matter to users.

Users don't have to know everything. But for a data ingestion app, it's important to be transparent about how the data flows and how it is stored in the system. Now that we no longer push source data directly into the data model, we have to surface the intermediate data storage between them.

Explorations

1) Display the raw data, the transformed data, and the data model object from left to right, then use UI transitions to guide users through the two tasks between them.


2) Display the extracted data table and let users transform the data on that table, then map the transformed data to the data model object.


3) Use the mapper to define the "data in," and let users define the mappings in a separate flow.

If the ingested data does not go directly into the data model, should mapping to the data model be a separate flow?


When the data model is the destination of the ingested data, data mapping naturally belongs in the flow because it defines how the data flows in. If the data does not flow into the data model, it is not necessary to include data mapping in the definition of data ingestion.

Data mapping is a required step to make data usable, but I still decided to separate it from the ingestion flow.


PROs

Gives users a sense that they can easily bring data in as is, without first complying with the rules of the new system.

Users can wait for the data asset to refresh at least once to validate that the data source is running correctly, before making the data generally available to other users in the organization.

Keeping the data model apart from data ingestion is a better strategy for keeping the data model clean and manageable.

Splitting the flow makes both user tasks short and approachable.


CONs

Some users will think they can start using the data as soon as they bring it into the system; the continuity of the flow is broken.

Users will not have the context of the data model when they transform the data while creating data assets.

"Before I make the data consumable for my business users, I want to be able to validate the data by checking the refresh status. Otherwise user could potentially create something bad."


"Data setup is less about speed, but more about trust and confidence."


"We normally plan our data model first and then ingest data into it. I am not very confortable to modify my data model based on the incoming data."


- from user research

Break the flow, but keep a sense of continuity across the tasks.


Breaking the data onboarding flow into two parts was well received by technical users who understand databases. However, users with less technical background assumed the data would become directly available for use once they deployed the data asset. Therefore, we needed to make sure users still see the continuity of the tasks as we separate them.

DESIGN CHALLENGE 2

Make data mapping easy for non-technical users, while scaling to complex scenarios.


The data model is the foundation for any use of the data in the system. When users bring in new data, the data asset needs to be mapped to the data model before it becomes available for use. The challenge for me was designing a data mapper that doesn't require a deep understanding of the data model.

Envision the user interface from the analogy of "ETL Pipelines"

Initial Proposal

The line mapper approach makes a lot of sense when the target data object is the data storage. The lines match the analogy of ETL "pipelines": data moves from one place to a new place and is transformed along the way.

The line mapper made sense when the data model was originally the target of the mapping, so data would "flow" from source to target. After the architecture change, is that still a good analogy for the interaction? Probably not, and there are other problems.

Problem #1: confusion between map and join


Some users confused the mapping with the data relationships they see in a regular ERD view. The line connecting one data set to another reminded them of an ERD.

“I expect after I connect the keys, all of the attributes in my table will be available for use in data visualization and consumer segmentation”


This was a very interesting comment from the user research. Even though we told the participant this was a data mapping task, the visual form of the interface led him to think he was setting relationships between two tables.

Problem #2: Scalability could be tricky


Since the lines connect from left to right, we can hardly allow scrolling in the left or right panel. Without scrolling, users may end up connecting a cell at the top left to one at the very bottom right corner. That would be a bad user experience.

Data mapping interaction explorations

Iteration

I explored four other interaction approaches for data mapping: dropdown menu selection, tagging, pairing, and drag-and-drop. They differ not only in the micro-interaction but also in the mental model they emphasize, as shown in the affinity diagram below.

Easy for development, but less user friendly

Tagging data asset is a good analogy

Feeding data model is a good analogy

Data Asset and data model are equivalent

The user's focus is on the data model. Data mapping means feeding the data model with the right data assets.


Even though our architect and engineers think the concept of "tagging" exactly explains what happens in the background, insights from previous user research showed that users consistently focus more on the data model. After all, the data model is what ultimately defines how the data is used, so it deserves the most attention.

Data mapping is the happy path: non-technical users can easily map the ingested data into the out-of-the-box data model we pre-defined for them. But that never covers every data need. What if users have metadata that doesn't exist in our pre-modeled schema? I designed a feature to add custom objects and fields on top of the mapper UI.

Adding custom data should be easy for users, but not too easy.


For most data-aware advertisers and marketers, the data model is a warning zone. Being able to create a data model object on the fly while bringing in new data sounds desirable, but also a little intimidating. So I tried to provide a convenient UX for customizing the data model, while making users treat data model modifications as a serious action.

If there is no standard object to map to, users can bring the data asset into the data model as a new object.

If the standard object is missing some fields, users can add a batch of custom fields to the object from the data asset.

The data model should not be easily modified, but users also don't want to stop their workflow and wait for help. So we let users make the changes they need, and admin users are notified about the changes and give approvals.
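To illustrate the shape of such a mapping, here is a hypothetical sketch of what a mapping definition might capture. The object names, field names, and "pending approval" status are illustrative assumptions, not the product's actual schema or API.

```python
# Hypothetical mapping definition for one data asset (illustrative only).
asset_mapping = {
    "data_asset": "crm_customers_2019",
    "target_object": "Consumer",                 # standard, pre-modeled object
    "field_mappings": [
        # Happy path: map ingested columns to existing fields on the standard object.
        {"source": "id", "target": "consumer_id"},
        {"source": "email", "target": "email_address"},
    ],
    "custom_fields": [
        # Fields that don't exist on the standard object; adding them modifies the
        # data model, so they stay pending until an admin approves the change.
        {"source": "loyalty_tier", "target": "loyalty_tier",
         "type": "string", "status": "pending_approval"},
    ],
}
```

In this sketch, mappings to standard fields describe the happy path, while custom additions carry a pending status until an admin approves them, reflecting the approval flow described above.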

Experience the prototype
