fabiansinz / cadwell2020

This project forked from crcadwell/cadwell2020

Code for analyses reported in Cadwell et al., 2020

License: Apache License 2.0

MATLAB 7.40% Rich Text Format 2.56% Jupyter Notebook 87.93% Python 2.06% Dockerfile 0.05% Shell 0.01%

cadwell2020's Introduction

Overview

This repository contains the Matlab, R and Python code used to analyze data and generate figures in Cadwell et al., Cell type composition and circuit organization of clonally related excitatory neurons in the juvenile mouse neocortex, eLife (2020).

License

Copyright 2020 C. R. Cadwell

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

General Organization

For quantitative analysis of clones and connectivity data, we used the Matlab implementation of DataJoint, which utilizes a relational database model for organizing, populating, and querying data. The DataJoint schemas are archived at atlab/commons and the critical tables for our analyses are described below.

Analysis of gene expression data was performed in R Bioconductor using custom software and previously developed packages including scran.

Modeling of the cortical circuit was done in Python using Jupyter Notebook.

For efficiency, the data were stored at intermediate stages of analysis as .mat, .txt, or .csv files.

Files related to Figure 1

Reconstruction of clones across slices (analyses used for Figure 1)

Tile scan Z-stacks of entire coronal sections were first maximally projected using the commercial acquisition software for the microscope. The positions of labeled cells were annotated using the following Matlab-based custom software:

  • Segmentation.m This code selects one maximally projected coronal section at a time, and has the user manually outline the contours of the cortex, and mark the positions of cortical neurons by presenting small patches of the cortex area. The positions of annotated cortical neurons are saved to a separate file.
  • showImages.m This code shows all annotated coronal sections for an entire mouse brain, including the outlines of the cortex and positions of the neurons identified above. The user can scroll through the slices to see how individual clones appear on adjacent sections. These images are aligned manually across slices to visualize reconstructed clones shown in Figure 1B,C and Figure 1-supplement 1A,B.

Quantification of clones at P10 and E12.5

  • CountCells.m While active, this code will count the number of annotated neurons within an area selected by the user, while viewing an annotated coronal section.
  • CloneQuantificaiton.mat Saved variables used to generate Figure 1D-F and Figure 1-supplement 1C,D.
  • Figure1D-J S1CD.m Code to generate Figure 1D-F,I,J and Figure 1-supplement 1C,D
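
The counting step performed by CountCells.m can be illustrated with a minimal Python sketch. This is not the repository's Matlab code: it assumes the user-selected area is a polygon given as a list of (x, y) vertices and uses a standard ray-casting test. The function names and the example coordinates are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: return True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices in order (assumed input format).
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate at which this edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_cells_in_region(cells, polygon):
    """Count annotated cell positions that fall inside the selected region."""
    return sum(point_in_polygon(x, y, polygon) for x, y in cells)


# Hypothetical example: a square region and five annotated cell positions.
region = [(0, 0), (10, 0), (10, 10), (0, 10)]
cells = [(2, 3), (5, 5), (12, 4), (9, 9), (-1, 5)]
print(count_cells_in_region(cells, region))  # prints 3
```

Points that lie exactly on the boundary are not handled specially in this sketch.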

Files related to Figure 2

Quality control of single-cell RNA-seq data, visualization using t-SNE, and generalized linear models to predict layer or region from gene expression data

  • Figure2C-H S1 S2.rtf Code to run in R Bioconductor to generate panels for Figure 2C-H, Figure 2-supplement 1 and Figure 2-supplement 2.
  • countdata.txt, samplelist.txt, genelist.txt, and annotations.txt Input data needed to run the analysis and generate figure panels using the above script in R.
  • All other files in this folder are final or intermediate outputs of the above R script.

Files related to Figure 3

t-SNE projection of Patch-seq data onto reference atlas and transcriptomic cluster assignment of each cell

  • rnaseqTools.py Useful functions for RNA-seq analysis, copied from other repositories by dkobak.
  • microcolumns.ipynb Python notebook for t-SNE projection and transcriptomic cluster assignment.

data.mat

Contains the raw data (normalized logcounts, counts) and metadata for each of the 206 samples included in our Patch-seq dataset. Metadata includes the following pieces of information about each cell:

  • exp Experiment number.
  • fp Firing pattern.
  • genes Gene names for the data included in counts.
  • label Indicates whether the cell was labeled by a fluorescent indicator (positive) or not (negative).
  • layer Layer position of the cell.
  • region Brain region, if known (V1 = primary visual cortex, SS1 = primary somatosensory cortex).
  • sample Unique sample ID.
  • slice Slice number (numbering restarted for each animal).
  • subject Unique animal ID.

allenData.mat

Contains our t-SNE projection data for the reference dataset from Tasic et al., 2018.

  • allentsne Contains the x and y t-SNE coordinates for each cell in the reference atlas. The third column is the cluster ID.
  • allentsneNames Names of cell clusters for each cell in the reference atlas.
  • allentsneColor RGB values for each cell in the reference atlas.

columnProjection.mat

Contains the t-SNE projection data for mapping our Patch-seq dataset onto the reference atlas.

  • cProj Contains the x and y t-SNE coordinates for each cell in our Patch-seq dataset. The third column is a measure of the uncertainty of the mapping; larger values indicate greater uncertainty (see the Methods section of the paper for how this is computed).

classAssignments.mat

Shows the best matching transcriptomic cluster in the reference atlas for each cell in our Patch-seq dataset. Cluster names and cluster IDs are the same as those used in Tasic et al., 2018.

  • class Name of the best-matched transcriptomic cluster for each Patch-seq cell.
  • classID Cluster ID of the best-matched transcriptomic cluster for each Patch-seq cell.

Figure3.m

Script for generating figure panels in Figure 3, Figure 3-supplement 1, and Figure 3-supplement 2.

Files related to Figures 4 and 5 and Table 1

Analysis of layer-specific connectivity rates (Figures 4 and 5)

  • Sort.m Code for sorting connectivity data into layer-specific groups, stored as a 3x3 matrix with one entry per layer combination.
  • Groups.mat Connections sorted into layer-specific groups.
  • allCounts.mat Summary of the number of connections, stored as a 3x3 matrix (one entry per layer combination) for each of the following categories:
    • biConnR: Related pairs with bidirectional connections.
    • biConnU: Unrelated pairs with bidirectional connections.
    • biUnconnR: Related pairs without bidirectional connections.
    • biUnconnU: Unrelated pairs without bidirectional connections.
    • connR: Related pairs with connection.
    • connU: Unrelated pairs with connection.
    • unconnR: Related pairs without connection.
    • unconnU: Unrelated pairs without connection.
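
As a rough illustration of how these count matrices can be turned into connectivity rates, the Python sketch below divides the connected count by the total number of tested pairs for each layer combination. It assumes that the connected and unconnected counts together cover all tested pairs; the function name and the numbers are made up for illustration and are not taken from allCounts.mat.

```python
def connection_rate(conn, unconn):
    """Element-wise rate conn / (conn + unconn) for two 3x3 count matrices.

    Entries with no tested pairs are returned as None instead of a rate.
    """
    rates = []
    for conn_row, unconn_row in zip(conn, unconn):
        row = []
        for connected, unconnected in zip(conn_row, unconn_row):
            total = connected + unconnected
            row.append(connected / total if total else None)
        rates.append(row)
    return rates


# Made-up counts for illustration only (not values from the paper).
conn_r = [[2, 1, 0], [3, 4, 1], [0, 2, 5]]
unconn_r = [[18, 9, 0], [27, 16, 9], [10, 18, 15]]

rates_r = connection_rate(conn_r, unconn_r)
for row in rates_r:
    print(row)
```

The same function could be applied to connU and unconnU, or to the bidirectional counts, to compare related and unrelated pairs for each layer combination.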

Simple model of connectivity (Figure 4G and 5E)

You need docker and docker-compose installed. All files are in the folder Connectivity.

  • Run docker-compose up. This will start a Jupyter Notebook server and a MySQL server in Docker containers.
  • Open your browser and go to localhost:8888.
  • Open the notebook Main.ipynb in the browser and execute all cells (SHIFT-ENTER).
  • This will generate the file expected_input_2019-06-04_FS.csv in the data directory.
  • expected_input_2019-06-04_FS.csv Output of model using the parameters used for the paper.

Analysis of connectivity using distance-matched controls (Figure 5 - supplement 2)

  • Resample.m Code for generating resampled data.
  • ResampledData.mat Resampled data generated using Resample.m.
  • TwoSided.m Code to generate two-sided p-values for resampled data.
  • TwoSided.mat Two sided p-values generated using TwoSided.m.
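
One common way to obtain a two-sided p-value from resampled data is sketched below in Python. It is an assumption about the general approach, not a transcription of TwoSided.m: the p-value is taken as the fraction of resampled values that lie at least as far from the resampled mean as the observed value does. The function name and the example numbers are hypothetical.

```python
def two_sided_p_value(observed, resampled):
    """Fraction of resampled values at least as extreme as the observed value.

    Extremeness is measured as the absolute distance from the mean of the
    resampled distribution (one convention among several).
    """
    center = sum(resampled) / len(resampled)
    observed_deviation = abs(observed - center)
    n_extreme = sum(
        1 for value in resampled if abs(value - center) >= observed_deviation
    )
    return n_extreme / len(resampled)


# Hypothetical example: number of connections found in each resampled set,
# compared with an observed count of 13.
resampled_counts = [8, 10, 12, 9, 11, 10, 13, 7, 10, 10]
p = two_sided_p_value(13, resampled_counts)
print(p)  # prints 0.2
```

In practice many more resamples would be used than in this example, since the smallest nonzero p-value that can be reported is one divided by the number of resamples.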

Power analysis (Figure 5 - Supplement 1)

  • PowerAnalysis.rtf Code used for power analysis in R.
  • Power, FoldChange, and Prl are outputs of PowerAnalysis.rtf.

Figure panels

  • Figures4DEFG5CDEFS2S3.m Code to generate Figure 4D-G, Figure 5C-F, Figure 5-supplement 2, Figure 5-supplement 3, and generalized linear model shown in Table 1.
  • Figure5S1 Code to generate Figure 5 - supplement 1.

Util

Custom-written functions used by multiple files above:

ChiSquared.m

Computes the Chi-squared test statistic and p-value.
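
The interface of ChiSquared.m is not described here, so the following Python sketch only illustrates the kind of computation involved. It assumes a 2x2 contingency table (for example, connected versus unconnected pairs in related versus unrelated groups) and computes the Pearson chi-squared statistic without continuity correction, with the p-value for one degree of freedom. The function name and the example counts are hypothetical.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value) with one degree of freedom and no
    continuity correction.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a nonzero total")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: 10 of 100 pairs connected in one group, 5 of 100 in the other.
statistic, p_value = chi_squared_2x2(10, 90, 5, 95)
print(statistic, p_value)
```

For small expected counts, an exact test such as Fisher's exact test is usually preferred over the chi-squared approximation.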

DataJoint database structure

Schema mc

mcSchema

Detailed table definitions can be found at atlab/commons

cadwell2020's People

Contributors

crcadwell, fabiansinz
