Merged
21 commits
bc22f82
Implement create_slice_grid WIP
constantinpape May 24, 2022
d247785
Refactor create_slice_grid functionality WIP
constantinpape May 24, 2022
0f33325
Complete implementation of create_slice_grid (untested)
constantinpape May 25, 2022
53d3d0a
Fix several issues in create_slice_grid
constantinpape May 25, 2022
8e12747
Refactor source metadata functionality
constantinpape May 28, 2022
f46cf6f
More refactoring
constantinpape May 28, 2022
eb45e0f
Refactor view functionality
constantinpape May 28, 2022
10f4cef
Update example notebook
constantinpape May 28, 2022
2b55d3a
Refactor slice grid view
constantinpape May 28, 2022
72621fe
Update example notebooks WIP
constantinpape May 28, 2022
5752b0c
Update citation info
constantinpape May 31, 2022
361a8fa
Start working on htm project creation example
constantinpape May 31, 2022
bece248
Move slice grid view functionality to experimental
constantinpape May 31, 2022
0b83b19
Work on htm example notebook
constantinpape Jun 1, 2022
495aa3d
Rename source_annotation to region also in code
constantinpape Jun 2, 2022
a1afdd7
Check that columns given in colorByColumn actually exist for regionDi…
constantinpape Jun 2, 2022
17f7b09
Check colorByColumn for segmentationDisplays
constantinpape Jun 2, 2022
89a6c55
More renaming of annotation to region
constantinpape Jun 2, 2022
61c3ab5
Add util functionality for computing clims for grid views
constantinpape Jun 2, 2022
023760f
Update htm example
constantinpape Jun 2, 2022
3ed5992
Update readmes
constantinpape Jun 2, 2022
8 changes: 5 additions & 3 deletions README.md
@@ -28,8 +28,10 @@ $ pip install -e .

## Usage

The library contains functionality to generate MoBIE projects and add data to it.
Check out [the example notebook](https://github.com/mobie/mobie-utils-python/blob/master/examples/create_mobie_project.ipynb) to see how to generate a MoBIE project.
The library contains functionality to generate MoBIE projects, add data to them, and create complex views.
For complete examples, please check out the [examples](https://github.com/mobie/mobie-utils-python/blob/master/examples):
- [normal project creation](https://github.com/mobie/mobie-utils-python/blob/master/examples/create_mobie_project.ipynb): generate a MoBIE project for multi-modal data from a CLEM experiment
- [htm project creation](https://github.com/mobie/mobie-utils-python/blob/master/examples/create_mobie_htm_project.ipynb): generate a MoBIE project for high-throughput microscopy from an imaging-based SARS-CoV-2 antibody assay.

Below is a short code snippet that shows how to use it in a python script.

@@ -74,4 +76,4 @@ Run `<COMMAND-NAME> --help` to get more information on how to use them.

## Citation

If you use the MoBIE framework in your research, please cite [Whole-body integration of gene expression and single-cell morphology](https://www.biorxiv.org/content/10.1101/2020.02.26.961037v1).
If you use the MoBIE framework in your research, please cite [the MoBIE bioRxiv preprint](https://www.biorxiv.org/content/10.1101/2022.05.27.493763v1).
6 changes: 6 additions & 0 deletions examples/README.md
@@ -0,0 +1,6 @@
# MoBIE python examples

This folder contains notebooks that demonstrate the usage of the MoBIE python library.
Currently, we have the following two example notebooks:
- [create_mobie_project](https://github.com/mobie/mobie-utils-python/blob/master/examples/create_mobie_project.ipynb): generate a MoBIE project for multi-modal data from a CLEM experiment
- [create_mobie_htm_project](https://github.com/mobie/mobie-utils-python/blob/master/examples/create_mobie_htm_project.ipynb): generate a MoBIE project for high-throughput microscopy from an imaging-based SARS-CoV-2 antibody assay.
332 changes: 332 additions & 0 deletions examples/create_mobie_htm_project.ipynb
@@ -0,0 +1,332 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "6da32cff",
"metadata": {},
"source": [
"# Create MoBIE HTM Project\n",
"\n",
"Create a MoBIE project for high-throughput microscopy data. The test data for this example is available here: https://owncloud.gwdg.de/index.php/s/eu8JMlUFZ82ccHT. It contains 3 wells of a plate from an immunofluorescence-based SARS-CoV-2 antibody assay from https://onlinelibrary.wiley.com/doi/full/10.1002/bies.202000257."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "38524f13",
"metadata": {},
"outputs": [],
"source": [
"# general imports\n",
"import os\n",
"import string\n",
"from glob import glob\n",
"\n",
"import mobie\n",
"import mobie.htm as htm\n",
"import pandas as pd\n",
"\n",
"# the location of the data\n",
"# adapt these paths to your system and the input data you are using\n",
"\n",
"# location of the input data.\n",
"# the example data used in this notebook is available via this link:\n",
"# https://oc.embl.de/index.php/s/IV1709ZlcUB1k99\n",
"example_input_folder = \"/home/pape/Work/data/htm-test-data\"\n",
"\n",
"# the location of the mobie project that will be created\n",
"# we recommend that the mobie project folders have the structure <PROJECT_ROOT_FOLDER>/data\n",
"# the folder 'data' will contain the sub-folders for individual datasets\n",
"mobie_project_folder = \"/home/pape/Work/data/mobie/mobie_htm_project/data\"\n",
"\n",
"# name of the dataset that will be created.\n",
"# one project can contain multiple datasets\n",
"dataset_name = \"example-dataset\"\n",
"dataset_folder = os.path.join(mobie_project_folder, dataset_name)\n",
"\n",
"# the platform and number of jobs used for computation.\n",
"# choose 'local' to run computations on your machine.\n",
"# for large data, it is also possible to run computation on a cluster;\n",
"# for this purpose 'slurm' (for slurm cluster) and 'lsf' (for lsf cluster) are currently supported\n",
"target = \"local\"\n",
"max_jobs = 4"
]
},
{
"cell_type": "markdown",
"id": "fe5a2a13",
"metadata": {},
"source": [
"## Adding image data\n",
"\n",
"First, we add all the image data for the 3 wells. Here, we have 3 channels:\n",
"- `serum`: showing the measured immunofluorescence of the human serum\n",
"- `marker`: showing a marker channel for viral RNA\n",
"- `nuclei`: showing the nuclei stained with DAPI\n",
"\n",
"The function `htm.add_images` will add sources to the dataset metadata for all `input_files` that are passed.\n",
"It **will not** add corresponding views to show the individual images. Instead, we will add a grid view below that recreates the plate layout and where all image (and segmentation) sources can be toggled on and off."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aeb0deb1",
"metadata": {},
"outputs": [],
"source": [
"# the individual images are stored as h5 files in the folder with the example data.\n",
"# each hdf5 file contains multiple datasets, each corresponding to a different image channel (or segmentation)\n",
"input_files = glob(os.path.join(example_input_folder, \"*.h5\"))\n",
"input_files.sort()\n",
"\n",
"# the resolution in micron for this data, as well as the downscaling factors and chunks to be used in the data conversion\n",
"resolution = [0.65, 0.65]\n",
"scale_factors = 4 * [[2, 2]]\n",
"chunks = [512, 512]\n",
"\n",
"# the 3 image channels (each stored as dataset in the h5 file corresponding to the site)\n",
"channels = [\"serum\", \"marker\", \"nuclei\"]\n",
"for channel_name in channels:\n",
" # image_names determines the names for the corresponding image sources in MoBIE\n",
" image_names = [os.path.splitext(os.path.basename(im))[0] for im in input_files]\n",
" image_names = [f\"{channel_name}-{name}\" for name in image_names]\n",
"\n",
" htm.add_images(input_files, mobie_project_folder, dataset_name,\n",
" image_names, resolution, scale_factors, chunks, key=channel_name,\n",
" target=target, max_jobs=max_jobs, file_format=\"ome.zarr\")"
]
},
{
"cell_type": "markdown",
"id": "dbfe819f",
"metadata": {},
"source": [
"## Add segmentation data\n",
"\n",
"Next, we add the segmentation data. Here, we have 2 segmentations per site:\n",
"- `cells`: the segmentation of individual cells\n",
"- `nuclei`: the segmentation of individual nuclei\n",
"\n",
"`htm.add_segmentations` works very similarly to `htm.add_images`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "49d15bd5",
"metadata": {},
"outputs": [],
"source": [
"segmentation_names = [\"cells\", \"nuclei\"]\n",
"for seg_name in segmentation_names:\n",
"    image_names = [os.path.splitext(os.path.basename(im))[0] for im in input_files]\n",
"    image_names = [f\"segmentation-{seg_name}-{name}\" for name in image_names]\n",
"\n",
"    htm.add_segmentations(input_files, mobie_project_folder, dataset_name,\n",
"                          image_names, resolution, scale_factors, chunks, key=f\"segmentation/{seg_name}\",\n",
"                          target=target, max_jobs=max_jobs, file_format=\"ome.zarr\")"
]
},
{
"cell_type": "markdown",
"id": "5d7e7a6f",
"metadata": {},
"source": [
"## Add views to create plate layout\n",
"\n",
"Finally, we create the view with the plate layout and data, using MoBIE `grid` transformations and `regionDisplays`.\n",
"In addition to the layout, we can also add tables associated with wells, or with individual sites (=image positions). Here, we can use the example table for our test data from: https://owncloud.gwdg.de/index.php/s/m1ILROJc7Chnu9h"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "79793e3e",
"metadata": {},
"outputs": [],
"source": [
"# first, we need to define functions that translate source names to site names and site names to well names,\n",
"# and that return the 2d grid position for a given well\n",
"\n",
"\n",
"# extract the site name (= well name and position in the well for an image)\n",
"# here, the site name comes in the source name after the source prefix, i.e.\n",
"# source_name = f\"{prefix}-{site_name}\"\n",
"def to_site_name(source_name, prefix):\n",
"    return source_name[(len(prefix) + 1):]\n",
"\n",
"\n",
"# extract the well name from the site name.\n",
"# here, the site name consists of well name and position in the well, i.e.\n",
"# site_name = f\"{well_name}_{position_in_well}\"\n",
"def to_well_name(site_name):\n",
"    return site_name.split(\"_\")[0]\n",
"\n",
"\n",
"# map the well name to its position in the 2d grid\n",
"# here, the wells are called C01, C02, etc.\n",
"def to_position(well_name):\n",
"    r, c = well_name[0], well_name[1:]\n",
"    r = string.ascii_uppercase.index(r)\n",
"    c = int(c) - 1\n",
"    return [c, r]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aa6e9438",
"metadata": {},
"outputs": [],
"source": [
"# all our source prefixes (= image channel / segmentation names)\n",
"# and the corresponding source types\n",
"source_prefixes = [\"nuclei\", \"serum\", \"marker\", \"segmentation-cells\", \"segmentation-nuclei\"]\n",
"source_types = [\"image\", \"image\", \"image\", \"segmentation\", \"segmentation\"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "031d6629",
"metadata": {},
"outputs": [],
"source": [
"# compute the contrast limits for the image channels\n",
"# (this is not strictly necessary, but usually very beneficial for htm data to obtain a reasonable visualization of the data)\n",
"clims_nuclei = htm.compute_contrast_limits(\"nuclei\", dataset_folder, lower_percentile=4, upper_percentile=96, n_threads=max_jobs)\n",
"clims_serum = htm.compute_contrast_limits(\"serum\", dataset_folder, lower_percentile=4, upper_percentile=96, n_threads=max_jobs)\n",
"clims_marker = htm.compute_contrast_limits(\"marker\", dataset_folder, lower_percentile=4, upper_percentile=96, n_threads=max_jobs)\n",
"\n",
"# specify the settings for all the sources\n",
"source_settings = [\n",
"    # nucleus channel: color blue\n",
"    {\"color\": \"blue\", \"contrastLimits\": clims_nuclei, \"visible\": True},\n",
"    # serum channel: color green\n",
"    {\"color\": \"green\", \"contrastLimits\": clims_serum, \"visible\": False},\n",
"    # marker channel: color red\n",
"    {\"color\": \"red\", \"contrastLimits\": clims_marker, \"visible\": False},\n",
"    # the settings for the 2 segmentations\n",
"    {\"lut\": \"glasbey\", \"tables\": [\"default.tsv\"], \"visible\": False, \"showTable\": False},\n",
"    {\"lut\": \"glasbey\", \"tables\": [\"default.tsv\"], \"visible\": False, \"showTable\": False},\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "85ea862d",
"metadata": {},
"outputs": [],
"source": [
"# create table for the sites (individual images)\n",
"\n",
"# adapt this to where this path is on your system\n",
"site_table_path = \"/home/pape/Work/data/htm-test-data/site-table.tsv\"\n",
"table = pd.read_csv(site_table_path, sep=\"\\t\")\n",
"\n",
"# the tables should be saved in a path relative to the dataset root folder,\n",
"# and are usually stored in the subfolder 'tables', with another sub-folder for each source or view with table(s)\n",
"table_out_path = os.path.join(dataset_folder, \"tables\", \"sites\")\n",
"os.makedirs(table_out_path, exist_ok=True)\n",
"table_out_path = os.path.join(table_out_path, \"default.tsv\")\n",
"\n",
"# we need to rename the site name from its representation in the table (C01-0001) to our representation (C01_1)\n",
"def rename_site(site_name):\n",
"    well, image_id = site_name.split(\"-\")\n",
"    image_id = int(image_id)\n",
"    return f\"{well}_{image_id}\"\n",
"\n",
"table[\"sites\"] = table[\"sites\"].apply(rename_site)\n",
"\n",
"# the first column in tables for a MoBIE region display (which is used internally by the grid view)\n",
"# has to be called \"region_id\"\n",
"table = table.rename(columns={\"sites\": \"region_id\"})\n",
"table.to_csv(table_out_path, sep=\"\\t\", index=False)\n",
"print(table)\n",
"\n",
"# this is the relative path to the table folder, in relation to the dataset folder\n",
"site_table_folder = os.path.split(os.path.relpath(table_out_path, dataset_folder))[0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "829020cf",
"metadata": {},
"outputs": [],
"source": [
"# we can also create a table for the wells; the procedure is analogous to the site (image) table above\n",
"\n",
"well_table_path = \"/home/pape/Work/data/htm-test-data/well-table.tsv\"\n",
"table = pd.read_csv(well_table_path, sep=\"\\t\")\n",
"\n",
"table_out_path = os.path.join(dataset_folder, \"tables\", \"wells\")\n",
"os.makedirs(table_out_path, exist_ok=True)\n",
"table_out_path = os.path.join(table_out_path, \"default.tsv\")\n",
"\n",
"table = table.rename(columns={\"wells\": \"region_id\"})\n",
"table.to_csv(table_out_path, sep=\"\\t\", index=False)\n",
"print(table)\n",
"\n",
"well_table_folder = os.path.split(os.path.relpath(table_out_path, dataset_folder))[0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f0b36b9b",
"metadata": {},
"outputs": [],
"source": [
"# create the plate grid view\n",
"dataset_folder = os.path.join(mobie_project_folder, dataset_name)\n",
"htm.add_plate_grid_view(dataset_folder, view_name=\"default\",\n",
"                        source_prefixes=source_prefixes, source_types=source_types, source_settings=source_settings,\n",
"                        source_name_to_site_name=to_site_name, site_name_to_well_name=to_well_name,\n",
"                        well_to_position=to_position, site_table=site_table_folder, well_table=well_table_folder,\n",
"                        sites_visible=False, menu_name=\"bookmark\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1901c4d1",
"metadata": {},
"outputs": [],
"source": [
"mobie.validation.validate_project(mobie_project_folder)"
]
},
{
"cell_type": "markdown",
"id": "7d685231",
"metadata": {},
"source": [
"For adding the necessary metadata to share the project via s3, and options for uploading it to s3, please check out the last cells of the `create_mobie_project.ipynb` notebook (in the same folder on GitHub as this one)."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.7"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
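
The naming helpers from the notebook above can be checked in isolation, without MoBIE or any image data. This is a minimal sketch, assuming source names of the form `<prefix>-<well>_<position>` (the pattern the notebook builds with `f"{channel_name}-{name}"`). The sample source names below are made up for illustration and are not taken from the test data.

```python
import string


def to_site_name(source_name, prefix):
    # drop the prefix and the single separator character that follows it
    return source_name[len(prefix) + 1:]


def to_well_name(site_name):
    # the well name is everything before the first underscore
    return site_name.split("_")[0]


def to_position(well_name):
    # row letter -> zero-based row index, column number -> zero-based column index
    row = string.ascii_uppercase.index(well_name[0])
    col = int(well_name[1:]) - 1
    # same order as in the notebook: column first, then row
    return [col, row]


def rename_site(site_name):
    # convert the table representation "C01-0001" to the "C01_1" form
    well, image_id = site_name.split("-")
    return f"{well}_{int(image_id)}"


# hypothetical source names and their prefixes, for illustration only
examples = [("nuclei-C01_1", "nuclei"), ("segmentation-cells-C02_3", "segmentation-cells")]

results = []
for source_name, prefix in examples:
    site = to_site_name(source_name, prefix)
    well = to_well_name(site)
    results.append((site, well, to_position(well)))

for site, well, position in results:
    print(site, well, position)
```

Because `to_site_name` strips the prefix by length rather than by splitting on `-`, prefixes that themselves contain a hyphen (such as `segmentation-cells`) are handled correctly, as long as the prefix passed in matches the one used when the source was added.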