Data

Clustering Benchmarks

About the Dataset

This dataset was sampled uniformly at random from the Playdrone dataset collected by Viennot et al. at Columbia University.

A JSON file containing the metadata and URLs for all apps in the snapshot.

The documentation of how to parse the JSON file.

Click the icon below to download a CSV file containing the Android app IDs and the index of each app in the aforementioned JSON file.

Download

csv_icon
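The following is a minimal sketch of how the index column might be used to pull the sampled apps out of the JSON snapshot. It rests on several assumptions that are not confirmed here: that each CSV row holds the app ID followed by an integer index, that the index is zero-based, and that the JSON file parses to a top-level list. The function names and the file names in the comments are hypothetical; check the parsing documentation listed above before relying on any of this.

```python
import csv
import json


def parse_sample_index(lines):
    """Return (app_id, json_index) pairs from the sample CSV.

    ASSUMPTION: each row is `app_id,index`. Rows whose second field is
    not an integer (for example a header row) are skipped.
    """
    pairs = []
    for row in csv.reader(lines):
        if len(row) < 2:
            continue
        try:
            pairs.append((row[0].strip(), int(row[1])))
        except ValueError:
            continue  # header or malformed row
    return pairs


def select_sampled_apps(snapshot, pairs):
    """Map each sampled app ID to its entry in the snapshot.

    ASSUMPTION: `snapshot` is a list and the index is zero-based.
    Out-of-range indices are ignored rather than raising.
    """
    return {
        app_id: snapshot[index]
        for app_id, index in pairs
        if 0 <= index < len(snapshot)
    }


# Hypothetical usage; both file names are placeholders:
#
# with open("sample.csv", newline="", encoding="utf-8") as handle:
#     pairs = parse_sample_index(handle)
# with open("snapshot.json", encoding="utf-8") as handle:
#     snapshot = json.load(handle)
# sampled = select_sampled_apps(snapshot, pairs)
```

If the index turns out to be one-based, or the snapshot is keyed by app ID rather than stored as a list, only `select_sampled_apps` needs to change.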

 

 


Clustering 2016

Before Download

We make the app store snapshots used in our studies publicly available. If you plan to use these datasets, please cite our work.

How to reference us:

Download citation as BibTeX or Plain Text.

About the Dataset

This dataset was collected in August 2014 from both the BlackBerry App World and Google Play app stores. It contains the metadata of each app (not the app’s executables), including the raw description and category. It covers 14,258 apps in 16 categories from BlackBerry App World and 3,619 apps in 23 high-level categories from Google Play.

More details about the datasets can be found in our paper.

If you need any other data mentioned in the paper (i.e., the output of any stage of the algorithm), please contact us and we will make it available to you.

Download

The dataset is available as CSV files. If you need it in any other format, please contact us.

Blackberry_35_2014.csv
Google_31_2014.csv
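As a quick sanity check after downloading, the files can be read with Python's standard `csv` module and the apps counted per category. This is only a sketch: it assumes the files have a header row, and the column name `"category"` in the usage comment is a guess, not something documented here. Inspect the header of each file and pass the real column name.

```python
import csv
from collections import Counter


def count_apps_per_category(lines, category_column):
    """Count rows per category in a CSV that has a header row.

    `category_column` is supplied by the caller because the actual
    column names are not documented on this page.
    """
    reader = csv.DictReader(lines)
    if reader.fieldnames is None or category_column not in reader.fieldnames:
        raise KeyError(
            f"column {category_column!r} not found; header is {reader.fieldnames}"
        )
    return Counter(row[category_column] for row in reader)


# Hypothetical usage; "category" is an assumed column name:
#
# with open("Blackberry_35_2014.csv", newline="", encoding="utf-8") as handle:
#     counts = count_apps_per_category(handle, "category")
# print(len(counts), "categories,", sum(counts.values()), "apps")
```

Opening the file with `newline=""` lets the `csv` module handle quoted fields that contain line breaks, which is likely for raw app descriptions. The resulting totals can be compared with the app and category counts quoted above.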