Implemented find_clones and added API.md readme
Andrew Gartland
committed
Sep 22, 2017
1 parent
7bdc586
commit d3ffbef
Showing 8 changed files with 573 additions and 79 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
# API branch | ||
--- | ||
|
||
|
||
# Single TCR example | ||
Here's an example of how you can process a single nucleotide sequence: | ||
|
||
```python | ||
betaNT = 'CGGGGGGGGTACCNTTGNTTAGGTCCTCTACACGGTTAACCTGGTCCCCGAACCGAAGGTCAATAGGGCCTGTATACTGCTGGCACAGAAGTACACAGCTGAGTCCCTGGGTTCTGAGGGCTGGATCTTCAGAGTGGAGTCANN' | ||
|
||
betaQuals = '12.12.12.12.12.22.9.8.6.6.6.8.3.0.3.10.3.0.3.10.10.11.20.25.30.37.37.29.27.14.14.15.27.30.41.47.36.50.50.50.42.42.57.57.43.47.53.47.47.47.47.47.47.50.54.57.57.57.68.68.68.68.68.68.68.68.68.68.68.68.68.68.68.68.57.57.57.57.59.59.59.57.57.57.57.57.57.57.57.59.57.68.68.68.68.68.68.68.68.68.68.68.68.68.68.68.68.68.68.68.68.68.68.59.59.59.59.59.57.57.57.59.57.57.43.37.28.28.21.28.23.37.28.30.15.19.17.15.21.20.25.3.0.0' | ||
|
||
import tcrdist as td | ||
chain = td.processing.processNT('human', 'B', betaNT, betaQuals) | ||
``` | ||
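
The `betaQuals` string above looks like a `.`-separated list of integer quality scores, apparently one per base of `betaNT` (the source does not state this, so treat it as an assumption). Under that assumption, a small pre-check before calling `processNT` could be sketched as follows; the helper names `parse_quals` and `low_quality_positions` are hypothetical and not part of `tcrdist`:

```python
def parse_quals(quals):
    """Split a '.'-separated quality string into a list of ints."""
    return [int(q) for q in quals.split('.')]


def low_quality_positions(seq, quals, threshold=20):
    """Return 0-based positions whose quality score is below `threshold`.

    Assumes one score per base; raises ValueError if the counts differ.
    """
    scores = parse_quals(quals)
    if len(scores) != len(seq):
        raise ValueError('Expected %d scores, got %d' % (len(seq), len(scores)))
    return [i for i, score in enumerate(scores) if score < threshold]
```

For instance, `low_quality_positions('ACGN', '12.30.41.0')` flags the first and last bases.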
|
||
# Pipeline example | ||
Here's an example of a TCR pipeline with `tcrdist`: | ||
|
||
```python | ||
import tcrdist as td | ||
psDf = td.datasets.loadPSData('human_pairseqs_v1') | ||
probDf = td.processing.computeProbs(psDf) | ||
psDf = psDf.join(probDf) | ||
clonesDf = td.processing.identifyClones(psDf) | ||
``` | ||
|
||
# Testing | ||
To run all tests: | ||
|
||
`py.test tcrdist` | ||
|
||
or to pass all print statements through for debugging: | ||
|
||
`py.test tcrdist -s` | ||
|
||
Note that importing and running functions in `tcrdist` generates a log file, `tcrdist.log`. You can set the logging `level` by modifying the parameter in the base `__init__.py` file using one of the following: | ||
|
||
``` | ||
logging.ERROR | ||
logging.WARNING | ||
logging.INFO | ||
logging.DEBUG | ||
``` | ||
|
||
For details see documentation on the `logging` [module](https://docs.python.org/2/library/logging.html). |
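
The levels listed above are constants from Python's standard `logging` module. As a minimal sketch of how a level filters what reaches a log file (the helper `configure_logging` and the logger name `tcrdist_example` are hypothetical, and this is not necessarily how the base `__init__.py` sets things up):

```python
import logging


def configure_logging(level=logging.INFO, filename='tcrdist.log'):
    """Attach a file handler to an example logger and set its level."""
    logger = logging.getLogger('tcrdist_example')  # hypothetical logger name
    logger.setLevel(level)
    handler = logging.FileHandler(filename)
    handler.setFormatter(
        logging.Formatter('%(asctime)s %(name)s %(levelname)s: %(message)s'))
    logger.addHandler(handler)
    return logger
```

With `level=logging.WARNING`, a call such as `logger.info(...)` is dropped while `logger.warning(...)` is written to the file.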
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,48 @@ | ||
import os.path as op | ||
import logging | ||
import inspect | ||
|
||
from . import processing | ||
|
||
logger = logging.getLogger('datasets.py') | ||
|
||
def loadPSData(name=None): | ||
    """Used to load paired sequence data without needing to be aware of the full path. | ||
    Call without parameters to see a list of datasets available.""" | ||
|
||
    datasets = dict(human_pairseqs_v1='human_pairseqs_v1.tsv', | ||
                    mouse_pairseqs_v1='mouse_pairseqs_v1.tsv', | ||
                    test_human_pairseqs='test_human_pairseqs.tsv', | ||
                    test_mouse_pairseqs='test_mouse_pairseqs.tsv') | ||
|
||
    if name is None or name == '': | ||
        print('Available datasets:') | ||
        for k in datasets.keys(): | ||
            print(' %s' % k) | ||
        return | ||
|
||
    try: | ||
        fn = datasets[name] | ||
    except KeyError: | ||
        logger.error('Could not find dataset: %s' % name) | ||
        # Stop here: fn is undefined for an unknown name. | ||
        return | ||
|
||
    if fn[-3:] == 'tsv': | ||
        delimiter = '\t' | ||
    else: | ||
        delimiter = ',' | ||
|
||
    datasetsPath = op.join(op.split(inspect.stack()[0][1])[0], 'datasets') | ||
    filename = op.join(datasetsPath, fn) | ||
|
||
    print('Reading from %s' % filename) | ||
|
||
    if 'mouse' in fn: | ||
        organism = 'mouse' | ||
    else: | ||
        organism = 'human' | ||
|
||
    psDf = processing.readPairedSequences(organism, filename) | ||
    return psDf | ||
|
||
|
||
|
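
An alternative to logging an unknown dataset name inside `loadPSData` is to raise, so the caller sees the failure at the lookup itself. The sketch below isolates the name-to-path step under that design; `resolve_dataset`, `DATASETS`, and the `base_dir` default are hypothetical names for illustration, with the dataset filenames copied from the diff above:

```python
import os.path as op

# Dataset names and filenames as they appear in loadPSData.
DATASETS = {
    'human_pairseqs_v1': 'human_pairseqs_v1.tsv',
    'mouse_pairseqs_v1': 'mouse_pairseqs_v1.tsv',
    'test_human_pairseqs': 'test_human_pairseqs.tsv',
    'test_mouse_pairseqs': 'test_mouse_pairseqs.tsv',
}


def resolve_dataset(name, base_dir='datasets'):
    """Return (organism, path) for a known dataset name.

    Raises KeyError listing the available names if `name` is unknown.
    """
    if name not in DATASETS:
        raise KeyError('Could not find dataset: %s (available: %s)'
                       % (name, ', '.join(sorted(DATASETS))))
    fn = DATASETS[name]
    # Same rule as loadPSData: a filename containing 'mouse' means mouse data.
    organism = 'mouse' if 'mouse' in fn else 'human'
    return organism, op.join(base_dir, fn)
```

Raising keeps the error close to its cause; a caller that prefers the logging behaviour can catch `KeyError` and log it.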