RAMP


Collaborative crowdsourcing data challenge site for building and optimizing data science workflows for scientific and business applications.

The RAMP is a versatile management and software tool for connecting data science to domain sciences, which is the main mission of the Paris-Saclay Center for Data Science (CDS). It grew organically out of our experience with data challenges, and evolved through the dozen iterations that we carried out in our research and training activities. The RAMP is developed as an in-house tool at the CDS, in collaboration with the Center for Scientific Management (CGS) at Ecole des Mines. It was originally designed as a collaborative prototyping tool that makes efficient use of data scientists' time in solving the data analytics segment of high-impact domain science problems. We then realized that it is equally valuable for training novice data scientists, for networking, for communication, and as a social science observatory. It is rapidly becoming a standard educational tool, used in three UPSaclay data science master's programs as well as in other programs in Paris and Lille. It has been used six times at Saclay, and in four hackathons outside Saclay (Paris School of Economics; French National Museum of Natural History; NCAR, Colorado; Epidemium, Paris).

The RAMP is used in the following operational context. As in a data challenge, the data provider arrives with a prediction problem and a corresponding data set. An experienced data scientist then cleans and curates the data and formalizes the problem. This process can take two weeks to six months, and results in a starting kit, typically an IPython notebook that introduces the domain science problem, describes the data, and shows a first untuned solution (benchmark). The problem is then set up using the RAMP software, and a RAMP event is organized with 30-50 data scientists and domain scientists. A RAMP event usually lasts a single day, so that it attracts data scientists who do not wish to spend a longer period learning the domain problem. We have been experimenting with other formats: data challenges usually take several months, and course projects can take several weeks. When the data science problem requires mastering a specific tool, the RAMP event can be preceded by a Training Sprint that explains the tool to the participants. Part of the Training Sprint can also be devoted to introducing the domain science problem; otherwise this introduction takes place at the beginning of the RAMP.

Collaborative prototyping. During the RAMP, the participants submit predictive solutions as code, as opposed to data challenges, where only predictions are submitted. The models are trained on our back-end, and the scores are displayed on a leaderboard. All participants have access to all code, and they are encouraged to look at and reuse each other's solutions. This accelerates development compared to challenges, since good ideas spread fast. The original single-day setup was tested on the HiggsML challenge (particle physics), on mortality prediction (health care), variable star classification (astrophysics), El Niño prediction (climate science), insect recognition (ecology), and replacing agent-based simulations (macroeconomy). Each of these events led to a significant improvement over the baseline. Since the organizers have access to all the code, the result of the day is a fully functioning near-optimal prototype.

Training. It became clear from the beginning that the tool had great value for hands-on training of data scientists. About half of our participants attended the RAMPs to learn data science. One novice participant with no background in data science found a data science job after attending our first four RAMPs. We have been using RAMPs in continuing education (50 students), and we ran a Data Camp M2 course which was part of three UPSaclay M2 programs (70 students): Data Science (math), AIC, and Data and Knowledge (computer science). The RAMP was also used in a course on Machine Learning for Finance and Economics at Université Panthéon-Assas, and in a graduate course in the Data analysis and decision program at Ecole Centrale de Lille.

Networking. Each RAMP attracts about 30-50 participants from different backgrounds and career stages, who usually meet for the first time. They develop a working relationship in a relaxed environment, and sometimes keep working together after the event.

Communication. The RAMPs have started to receive increased attention from outside the Université Paris-Saclay. We collaborated with the SPIPOLL project of the French National Museum of Natural History, and the event was written up in a blog post by one of the participating start-ups. We were invited to the fifth Climate Informatics workshop in Boulder, Colorado, to re-run our El Niño RAMP. We collaborated with the Paris School of Economics on our macroeconomy RAMP. Finally, Epidemium recently organized a hackathon using our tool for cancer rate prediction.

Social science observatory. The RAMPs generate a significant amount of quantitative data on the way data scientists work and collaborate with each other, which allows social scientists to study the dynamics of collaborative work. The results of these analyses can then be used to optimize certain aspects of the RAMP format.
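
To make the collaborative prototyping loop concrete (participants submit code, the organizers' back-end trains it, and scores appear on a leaderboard), here is a minimal sketch in plain Python. It is not the RAMP software's API: the `Submission` class, the `run_backend` function, the two toy entries, and the tiny data set are all hypothetical and exist only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import Callable

# All names below are hypothetical illustrations, not part of the RAMP software.


@dataclass
class Submission:
    author: str
    train: Callable  # submitted code: takes (X, y) and returns a predict function


def majority_class(X, y):
    """Untuned benchmark: always predict the most frequent training label."""
    most_common = max(set(y), key=y.count)
    return lambda features: most_common


def mean_threshold(X, y):
    """Toy entry: cut the first feature halfway between the two class means.

    Assumes binary labels 0/1 and that class 1 has the larger mean.
    """
    pos = [x[0] for x, label in zip(X, y) if label == 1]
    neg = [x[0] for x, label in zip(X, y) if label == 0]
    cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda features: int(features[0] > cut)


def accuracy(model, X, y):
    """Fraction of held-out examples the model labels correctly."""
    return sum(model(x) == label for x, label in zip(X, y)) / len(y)


def run_backend(submissions, train_set, test_set):
    """Train every submission on the organizers' side and rank by score."""
    X_train, y_train = train_set
    X_test, y_test = test_set
    rows = []
    for sub in submissions:
        model = sub.train(X_train, y_train)  # participants send code, not predictions
        rows.append((sub.author, accuracy(model, X_test, y_test)))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Tiny made-up data set: one feature, binary label.
train_set = ([[0.1], [0.2], [0.3], [0.8], [0.9]], [0, 0, 0, 1, 1])
test_set = ([[0.15], [0.25], [0.7], [0.95]], [0, 0, 1, 1])

leaderboard = run_backend(
    [Submission("benchmark", majority_class),
     Submission("participant_a", mean_threshold)],
    train_set,
    test_set,
)

for rank, (author, score) in enumerate(leaderboard, start=1):
    print(rank, author, score)
```

Because each entry is code rather than a file of predictions, the organizers can rerun, inspect, and combine entries after the event, which is what makes the end-of-day prototype reusable.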

Learn more at ramp.studio

