To accompany OpenPhenom, Recursion is releasing the RxRx3-core dataset, a challenge dataset in phenomics optimized for the research community. RxRx3-core includes labeled images of 735 genetic knockouts and 1,674 small-molecule perturbations drawn from the RxRx3 dataset, image embeddings (computed with OpenPhenom, MAE-L/8, and MAE-G/8), and associations between the included small molecules and genes. The dataset contains 6-channel Cell Painting images and associated embeddings from 222,601 wells, yet is smaller than 18 GB, which keeps it readily accessible to the research community.
Mapping the mechanisms by which drugs exert their actions is an important challenge in advancing the use of high-dimensional biological data like phenomics. We are excited to release the first dataset of this scale probing concentration-response along with a benchmark and model to enable the research community to rapidly advance this space.
Paper published at the LMRL Workshop at ICLR 2025: "RxRx3-core: Benchmarking drug-target interactions in High-Content Microscopy".
Benchmarking code for this dataset is provided in the EFAAR benchmarking repo and Polaris.
Loading the RxRx3-core image dataset
```python
from datasets import load_dataset

rxrx3_core = load_dataset("recursionpharma/rxrx3-core")
```
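Each image record carries a `__key__` string such as `compound-001/Plate1/AA15_s1_1` alongside a 512-pixel-wide `jp2` image. The following is a minimal sketch for splitting such a key into its parts. Reading `s1` as an imaging site and the trailing number (1 to 6) as one of the six Cell Painting channels is an assumption inferred from the key pattern, and `ImageKey`, `parse_key`, and `well_id_from_key` are hypothetical helper names, not part of the dataset's tooling.

```python
import re
from typing import NamedTuple


class ImageKey(NamedTuple):
    experiment_name: str
    plate: int
    address: str
    site: int      # assumed meaning of the "s1" field
    channel: int   # assumed meaning of the trailing 1-6 field


# Pattern inferred from keys such as "compound-001/Plate1/AA15_s1_1".
_KEY_RE = re.compile(
    r"^(?P<experiment>[^/]+)/Plate(?P<plate>\d+)/"
    r"(?P<address>[A-Z]{1,2}\d{2})_s(?P<site>\d+)_(?P<channel>\d+)$"
)


def parse_key(key: str) -> ImageKey:
    """Split an image key into experiment, plate, address, site, channel."""
    m = _KEY_RE.match(key)
    if m is None:
        raise ValueError(f"unrecognized key: {key!r}")
    return ImageKey(
        m["experiment"], int(m["plate"]), m["address"],
        int(m["site"]), int(m["channel"]),
    )


def well_id_from_key(key: str) -> str:
    """Build a metadata-style well_id (experiment_plate_address) from a key."""
    k = parse_key(key)
    return f"{k.experiment_name}_{k.plate}_{k.address}"
```

If the assumption about the key layout holds, `well_id_from_key` gives a string in the same `experiment_plate_address` form that the metadata uses for `well_id`, which makes it possible to look up the perturbation for a given image.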
Loading OpenPhenom embeddings and metadata for RxRx3-core
```python
from huggingface_hub import hf_hub_download
import pandas as pd

# Download (or reuse from the local cache) the metadata and embedding files.
file_path_metadata = hf_hub_download(
    "recursionpharma/rxrx3-core",
    filename="metadata_rxrx3_core.csv",
    repo_type="dataset",
)
file_path_embs = hf_hub_download(
    "recursionpharma/rxrx3-core",
    filename="OpenPhenom_rxrx3_core_embeddings.parquet",
    repo_type="dataset",
)

open_phenom_embeddings = pd.read_parquet(file_path_embs)
rxrx3_core_metadata = pd.read_csv(file_path_metadata)
```
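Once both tables are loaded, a common next step is to attach the metadata to each embedding row. The sketch below uses two tiny stand-in frames instead of the downloaded files so that it runs offline. It assumes the embeddings table carries a `well_id` column matching the one in the metadata, so check `open_phenom_embeddings.columns` before relying on that. The `emb_0` and `emb_1` column names are made up for illustration.

```python
import pandas as pd

# Stand-ins for rxrx3_core_metadata and open_phenom_embeddings.
metadata = pd.DataFrame({
    "well_id": ["compound-001_10_AA12", "gene-077_3_L32"],
    "perturbation_type": ["COMPOUND", "CRISPR"],
    "treatment": ["Esomeprazole", "CENPC_guide_3"],
})
embeddings = pd.DataFrame({
    "well_id": ["gene-077_3_L32", "compound-001_10_AA12"],  # assumed join key
    "emb_0": [0.1, 0.2],  # hypothetical embedding columns
    "emb_1": [0.3, 0.4],
})

# validate="one_to_one" raises if a well_id is duplicated on either side.
joined = embeddings.merge(
    metadata, on="well_id", how="inner", validate="one_to_one"
)

# Example selection: keep only the CRISPR wells.
crispr_wells = joined[joined["perturbation_type"] == "CRISPR"]
```

An inner join drops wells that appear in only one table; switching to `how="left"` would keep every embedding row and leave missing metadata as NaN.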
Metadata
The metadata can be found in metadata_rxrx3_core.csv in this repository. The schema of the metadata is as follows:
| Attribute | Description |
|---|---|
| well_id | Experiment Name - Plate - Well (compound-004_1_AA04 or gene-088_9_C15) |
| experiment_name | Experiment Name: Experiment number (compound-004 or gene-088) |
| plate | Plate number in the experiment (1-48) |
| address | Well location on the plate - "A01" to "AF48". |
| gene | Unblinded or anonymized gene name, or a control |
| treatment | Compound synonym or gene-name - guide-number (Narlaprevir or _guide_1) |
| SMILES | Canonical SMILES or blank for non-compounds |
| concentration | Compound concentration tested (in µM) |
| perturbation_type | CRISPR or COMPOUND |
| cell_type | HUVEC |
| well_type_label | Indicates experimental control information |
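Because `well_id` is the experiment name, plate, and address joined by underscores, it can be split back into those fields. The sketch below also converts an address into zero-based row and column indices; the range "A01" to "AF48" implies 32 rows by 48 columns, assuming row letters run A to Z and then AA to AF. `split_well_id` and `address_to_row_col` are hypothetical helper names.

```python
import re


def split_well_id(well_id):
    """Split 'compound-004_1_AA04' into ('compound-004', 1, 'AA04')."""
    # Split from the right so hyphens or underscores in the experiment
    # name cannot shift the plate and address fields.
    experiment_name, plate, address = well_id.rsplit("_", 2)
    return experiment_name, int(plate), address


_ADDRESS_RE = re.compile(r"^([A-Z]{1,2})(\d{2})$")


def address_to_row_col(address):
    """Convert 'A01'..'AF48' to zero-based (row, column) indices."""
    m = _ADDRESS_RE.match(address)
    if m is None:
        raise ValueError(f"unrecognized address: {address!r}")
    letters, digits = m.groups()
    row = 0
    for ch in letters:  # spreadsheet-style letters: A..Z, AA, AB, ...
        row = row * 26 + (ord(ch) - ord("A") + 1)
    return row - 1, int(digits) - 1
```

For example, `split_well_id("gene-088_9_C15")` yields the experiment `gene-088`, plate 9, and address `C15`, and `address_to_row_col("AF48")` maps the last well to row 31, column 47.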
The well_type_label column includes the following values:
| well_type_label | Description |
|---|---|
| Query guides | CRISPR guides that target a query gene |
| Exon controls | Exon-targeting CRISPR guides that are used as controls |
| Intron controls | Intron-targeting CRISPR guides that are used as controls |
| Query Compounds + Intron control | Query compounds on an intron-targeting CRISPR background |
| CRISPR Gene Positive Controls | Exon-targeting CRISPR guides against control genes, used as positive controls; there are five such genes, each with six guides that target the exon region of the gene |
| Control Compounds + Intron control | Control compound on an intron-targeting CRISPR background |
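When filtering wells, it can help to collapse the six labels above into a coarse query-versus-control role. The following is one possible grouping, offered as a sketch: it assumes the strings in the CSV match the table spelling exactly, and `well_role` is a hypothetical helper name.

```python
QUERY_LABELS = {
    "Query guides",
    "Query Compounds + Intron control",
}
CONTROL_LABELS = {
    "Exon controls",
    "Intron controls",
    "CRISPR Gene Positive Controls",
    "Control Compounds + Intron control",
}


def well_role(well_type_label):
    """Return 'query' or 'control' for a well_type_label value."""
    if well_type_label in QUERY_LABELS:
        return "query"
    if well_type_label in CONTROL_LABELS:
        return "control"
    # Fail loudly on unexpected spellings rather than guessing a role.
    raise ValueError(f"unknown well_type_label: {well_type_label!r}")
```

Raising on unknown values is a deliberate choice here: a silently misclassified control well would be harder to notice than an exception during parsing.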
To help understand the metadata, we have included some samples to enable parser testing and validation:

```
well_id,experiment_name,plate,address,gene,treatment,SMILES,concentration,perturbation_type,cell_type,well_type_label
compound-001_10_AA12,compound-001,10,AA12,,Esomeprazole,"COC1=CC2=C([N-]C(=N2)[S@@](=O)CC2=C(C)C(OC)=C(C)C=N2)C=C1 |r,c:7,13,21,24,t:2,4,18|",2.5,COMPOUND,HUVEC,Control Compounds + Intron control
gene-077_3_L32,gene-077,3,L32,CENPC,CENPC_guide_3,,,CRISPR,HUVEC,Query guides
```
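The sample rows are a useful parser check because the SMILES field is quoted and contains commas, so naive splitting on `,` would break it. A minimal sketch using Python's standard `csv` module follows; `to_float_or_none` is a hypothetical helper for the blank concentration on CRISPR rows.

```python
import csv
import io

SAMPLE = '''well_id,experiment_name,plate,address,gene,treatment,SMILES,concentration,perturbation_type,cell_type,well_type_label
compound-001_10_AA12,compound-001,10,AA12,,Esomeprazole,"COC1=CC2=C([N-]C(=N2)[S@@](=O)CC2=C(C)C(OC)=C(C)C=N2)C=C1 |r,c:7,13,21,24,t:2,4,18|",2.5,COMPOUND,HUVEC,Control Compounds + Intron control
gene-077_3_L32,gene-077,3,L32,CENPC,CENPC_guide_3,,,CRISPR,HUVEC,Query guides
'''


def to_float_or_none(text):
    """Blank fields (e.g. concentration on CRISPR rows) become None."""
    return float(text) if text else None


# csv.DictReader honors the quoting, so the SMILES stays in one field.
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
compound_row, guide_row = rows

compound_conc = to_float_or_none(compound_row["concentration"])
guide_conc = to_float_or_none(guide_row["concentration"])
```

Here the compound row keeps its full SMILES string, including the trailing `|r,c:7,13,21,24,t:2,4,18|` annotation, while the guide row has empty `SMILES` and `concentration` fields.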