Thursday, November 30, 2017

Exploring Data in Paco

New Feature!

Once you have run an experiment and collected data in Paco, you probably want to see if the data answers any of your research questions. In the past, Paco researchers have had to export their data using our CSV, HTML, or JSON report generation functionality to get the data in a form for importing into a statistics tool. Now, you can answer basic questions about your data on the website without exporting.


New Data Exploration/Visualization Builder Wizard


We have been developing a framework for specifying and executing visualizations to support data exploration. The basics of the framework are in production and will allow us to continue to add more capabilities to the system as the need arises. We'll talk about that in another blog post. First, let's look at what it can do right now.

What Questions Can You Ask Right Now?

Data exploration is an iterative process in which a data analyst follows the data, hoping to support or refute some hypotheses. Another use of visualization is to show answers graphically in reports. Paco's tooling is focused on exploration at the moment. You can save a visualization for use in a report, but first you have to explore the data to find answers worth sharing.


Current Data Exploration Question Options

  • How is data distributed?
  • What does a variable do over time?
  • What are the relationships between variables?
  • What do phone app sessions look like? 


Distribution Questions

To get started exploring data, the first questions are usually about the distribution of the data. Does it generally lump around certain values? How spread out is it? What are the outliers? Are those outliers just noise or an indication of something real in the data set?
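If you prefer to see these questions as numbers, here is a minimal sketch of the summary a box plot draws: the quartiles, plus any points that fall outside the common 1.5 × IQR fences. This is not Paco's implementation. The function name `summarize` and the sample values are made up for illustration, and the quartile method chosen here is only one of several conventions.

```python
import statistics

def summarize(values):
    """Return the five-number summary and IQR-based outliers (illustrative only)."""
    # "inclusive" treats the data as the whole population; other quartile
    # conventions give slightly different cut points.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "outliers": outliers,
    }

# Invented sample values, not data from a real Paco experiment.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(summarize(sample))
```

Whether a flagged point is noise or something real is still a judgment call; the fences only tell you where to look.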


Box plot of the distribution of "Importance" of needs across participants


Bubble Chart of distribution of "Location" of participants

Time Series Questions

The point, one might say, of the longitudinal studies that Paco facilitates is to see what happens over time. Is a behavior actually a routine? Does a behavior pattern change after an intervention?

So, Paco now allows you to plot a variable over time.
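For readers working with exported data, a rough sketch of the same idea is to group responses by day and average them, which gives one point per day to plot. The `(timestamp, value)` pair layout and the name `daily_means` are assumptions for this example, not Paco's export format or API, and the reaction times are invented.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

def daily_means(responses):
    """Average a numeric variable per calendar day.

    `responses` is an iterable of (datetime, number) pairs, a hypothetical
    layout chosen for this sketch.
    """
    by_day = defaultdict(list)
    for timestamp, value in responses:
        by_day[timestamp.date()].append(value)
    # Sort by day so the result reads in chronological order.
    return {day: mean(values) for day, values in sorted(by_day.items())}

# Invented reaction times in milliseconds.
responses = [
    (datetime(2017, 11, 27, 9, 0), 300),
    (datetime(2017, 11, 27, 15, 0), 340),
    (datetime(2017, 11, 28, 9, 0), 280),
]
for day, average in daily_means(responses).items():
    print(day, average)
```

Comparing the daily averages before and after an intervention date is one simple way to start on the second question above.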

Reaction times over three days (Reflex experiment)

Relationship Questions

After the individual data items are characterized, questions often come up about the relationships between variables. Does variable x have any effect on variable y? Are they correlated? Enter the scatter plot in Paco.
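A scatter plot shows the relationship visually; the usual number to go with it is the Pearson correlation coefficient. The sketch below computes it from scratch so the arithmetic is visible. It is offered only as an illustration, since the post does not say whether Paco reports a coefficient, and the name `pearson_r` and the sample ratings are invented.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / (spread_x * spread_y)

# Invented ratings on a 1-5 scale, not data from a real study.
importance = [1, 2, 3, 4, 5]
quickness = [2, 2, 3, 5, 4]
print(round(pearson_r(importance, quickness), 3))
```

A value near 1 or -1 suggests a strong linear relationship and a value near 0 suggests little linear relationship. Correlation alone does not show that x has an effect on y.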


Scatter plot of relationship between reported Importance and Quickness required to find information


App Usage Sessions

For the specific case of app usage logs collected from users over weeks or months, Paco can plot each app usage session over the course of the study, showing when the session started and which apps were used, in order, during that session.
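To make the idea of a session concrete, here is one possible way to group raw app-usage events into sessions: start a new session whenever the gap since the previous event exceeds a threshold. This is an assumption for illustration only. The post does not describe how Paco delimits sessions, and the five-minute gap, the `(timestamp, app)` event layout, and the name `group_sessions` are all hypothetical.

```python
from datetime import datetime, timedelta

def group_sessions(events, max_gap=timedelta(minutes=5)):
    """Group (datetime, app_name) events into sessions split by idle gaps.

    Returns a list of (session_start, [apps in order]) tuples. The gap rule
    is an assumed definition of a session, not Paco's.
    """
    sessions = []
    for timestamp, app in sorted(events):
        if sessions and timestamp - sessions[-1]["last"] <= max_gap:
            # Close enough to the previous event: same session.
            sessions[-1]["apps"].append(app)
            sessions[-1]["last"] = timestamp
        else:
            sessions.append({"start": timestamp, "last": timestamp, "apps": [app]})
    return [(s["start"], s["apps"]) for s in sessions]

# Invented events for one participant.
events = [
    (datetime(2017, 11, 27, 9, 0), "Mail"),
    (datetime(2017, 11, 27, 9, 2), "Maps"),
    (datetime(2017, 11, 27, 12, 0), "Camera"),
]
for start, apps in group_sessions(events):
    print(start, apps)
```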


App Usage sessions for one participant over three days