Four Features that Every Academic Research Data Platform Should Have
Ever spend hours downloading a massive dataset, only to realize it’s not a great fit for your research after all?
If so, you’re not alone. The process of discovering, vetting, and accessing big data causes major headaches for many academic researchers – and can really slow down the research process.
It doesn’t have to be this way! With more and more data providers realizing the benefits of academic partnerships, universities have more choices than ever for accessing research data. When evaluating a data provider or platform, here are four crucial capabilities to consider that will save you time, money, and stress.
1. Freedom to explore before purchase
If the opening line of this post brought back some bad memories – we’re sorry! All too often, data companies don’t provide enough information about their datasets for researchers to understand what’s included, which can lead to frustration down the road. This is why it’s so important to find a data provider that will let you explore the data before you commit.
On Dewey, you can download a 100-row sample of any dataset on our platform before subscribing to confirm that the data provides exactly what you need. You’ll also be able to access key details such as full data dictionaries, variable descriptions, and methodology resources at no cost.
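Once you have a sample file in hand, a quick script can show how complete each column is before you commit. The sketch below is only an illustration in plain Python: the column names and the three-row sample are made up, and it does not use or represent any Dewey tooling.

```python
import csv
import io


def summarize_sample(csv_text):
    """Return {column: fraction of rows with a non-empty value}."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    filled = {name: 0 for name in columns}
    total = 0
    for row in reader:
        total += 1
        for name in columns:
            # Short rows yield None for missing fields, so fall back to "".
            if (row.get(name) or "").strip():
                filled[name] += 1
    if total == 0:
        return {name: 0.0 for name in columns}
    return {name: filled[name] / total for name in columns}


# Hypothetical three-row sample; the column names are illustrative only.
SAMPLE = """place_name,city,naics_code
Cafe Uno,Seattle,722515
Book Nook,,451211
Bike Hub,San Diego,
"""

fill_rates = summarize_sample(SAMPLE)
for column, rate in fill_rates.items():
    print(f"{column}: {rate:.0%} filled")
```

A low fill rate on a variable your project depends on is a useful early warning that the dataset may not fit.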
Getting access to new and unconventional datasets has completely changed the direction of my research pipeline. Getting access to them on a platform designed for academics is icing on the cake. - Nicholas Hallman, Assistant Professor, UT Austin
2. Ability to download only what you need
Working with big data can put a strain on your hard drive. If you’ve found the perfect dataset but don’t need the whole thing, you should be able to customize your export to fit your project’s specific requirements.
The Dewey platform enables subscribers to save file space (and precious time) by downloading exactly what they need.
For example, let’s say you’re interested in SafeGraph’s Global Places POI dataset, but only the data from Seattle, San Francisco, Los Angeles, and San Diego. You can quickly filter on those four values – in this case, cutting your export from nearly 48 million rows to half a million.
This custom filtering, with data visualizations included, helps researchers answer questions like ‘Are the exact domains I am looking for included?’ or ‘How many states have over 100 businesses included?’ before they spend time exporting the data.
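For readers who want to see what a value filter like the four-city example amounts to, here is a minimal local sketch in plain Python. It is not Dewey's filtering feature or API: the `city` column name, the `filter_by_city` helper, and the tiny in-memory file are all assumptions made for illustration.

```python
import csv
import io

TARGET_CITIES = {"Seattle", "San Francisco", "Los Angeles", "San Diego"}


def filter_by_city(source, destination, cities, column="city"):
    """Copy rows whose `column` value is in `cities`; return (rows_read, rows_kept)."""
    reader = csv.DictReader(source)
    writer = csv.DictWriter(destination, fieldnames=reader.fieldnames)
    writer.writeheader()
    rows_read = rows_kept = 0
    # Rows are streamed one at a time, so the full file never sits in memory.
    for row in reader:
        rows_read += 1
        if row.get(column) in cities:
            writer.writerow(row)
            rows_kept += 1
    return rows_read, rows_kept


# Tiny in-memory stand-in for a large export (hypothetical data).
raw = io.StringIO(
    "place_name,city\n"
    "Cafe Uno,Seattle\n"
    "Book Nook,Portland\n"
    "Bike Hub,San Diego\n"
)
filtered = io.StringIO()
rows_read, rows_kept = filter_by_city(raw, filtered, TARGET_CITIES)
print(f"kept {rows_kept} of {rows_read} rows")
```

Filtering before export, as described above, avoids downloading the unwanted rows in the first place; a script like this only helps after the full file is already on disk.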
3. Options to consume data your way
Researchers often have their own preferences when it comes to the tools they use for working with data. In many cases, the university’s systems and other paid services dictate the options for consuming data from a third party. For companies delivering data, flexibility is key.
Looking at the last year of data consumption from the Dewey platform, a majority of researchers use an API connection or a simple CSV download to access data. We hypothesize that an API connection makes sense for large-scale analysis at universities with high-performance computing clusters. When such services are not available, BigQuery, Google Cloud, and S3 buckets are still frequently used.
Whether you are a whiz at Snowflake Sharing or prefer a simple download, we have you covered with integration options, many of which refresh regularly to ensure the most up-to-date data.
4. Resources that actually answer your questions
As a Ph.D. student, I need a data hub that is user-friendly, organized, and most importantly affordable! From the early days of learning how to code to advanced spatial analysis, Dewey has been a critical support each step of the way! - Karina Amalbert, PhD Student, FSU
Discovering, licensing, and consuming data for your research can be a complex process with various roadblocks along the way. Make sure to work with data providers that have clear documentation readily available and offer access to live support personnel, so your research won’t get delayed by a lack of information or support.
While data access platforms tend to have slow, outdated systems for this type of support, Dewey offers a new approach. Our purpose is to make your life easier (at least when it comes to data collection!) and that includes making sure you have all the information you need about the data you’re considering using. Here are a couple of ways we help with that:
Clear Documentation: Dewey Docs is our users’ go-to source for answering questions on the fly, including technical resources and detailed information on all our partners and their datasets.
Personalized Support: The Dewey team specializes in academic data licensing and is ready to offer real, human support whenever you need it.
Get in Touch with Us
Big data purchasing is a big decision. From our perspective, the above features are table stakes for supporting the modern academic researcher.
If that is not your experience with the data you are accessing, we would love to help you on your research journey. Get in touch so we can understand the data you are looking for, and we will do everything in our power to lighten the load.