Commit
updating docstrings
Erik Lee committed Jun 25, 2020
1 parent e8c745b commit 6bd9bae
Showing 10 changed files with 150 additions and 1,114 deletions.
Binary file removed __pycache__/__init__.cpython-36.pyc
Binary file modified discovery_imaging_utils/__pycache__/__init__.cpython-36.pyc
Binary file modified discovery_imaging_utils/__pycache__/nifti_utils.cpython-36.pyc
135 changes: 110 additions & 25 deletions discovery_imaging_utils/denoise_ts_dict.py
@@ -14,6 +14,103 @@

def denoise(parc_dict, hpf_before_regression, scrub_criteria_dictionary, interpolation_method, noise_comps_dict, clean_comps_dict, high_pass, low_pass):

"""A function to denoise a dictionary with parcellated fMRI data.
Function to denoise resting-state fMRI data. DOCUMENATION UNDERGOING TRANSITION...
Using the tools in *parc_ts_dictionary*, a dictionary is generated containing all of the items that would be conventionally used in denoising the fMRI time signal. This dictionary can then be used to generated denoised data with the *denoise* function within *denoise_ts_dict*. The *denoise* function is seen as follows:
.. code-block:: python

    denoise(parc_dict,
            hpf_before_regression,
            scrub_criteria_dictionary,
            interpolation_method,
            noise_comps_dict,
            clean_comps_dict,
            high_pass,
            low_pass)
Parameters
----------
parc_dict : dict
    A dictionary generated by parc_ts_dictionary.
hpf_before_regression : bool or float
    Specifies whether the nuisance regressors and the time-signals of interest (i.e. the parcellated time-signals) should be high-pass filtered before the nuisance regressors are regressed from the parcellated time-signals. Set to False to skip this step; otherwise set to the desired high-pass cutoff frequency.
scrub_criteria_dictionary : dict or bool
    This argument defines how scrubbing should be conducted. If you do not want to do scrubbing, set this argument to False. Otherwise, there are a few different configuration options for the dictionary:
    (1) {'std_dvars' : 1.2, 'framewise_displacement' : 0.5}
    (2) {'Uniform' : [0.8, ['std_dvars', 'framewise_displacement']]}
    In the first example, any timepoint with std_dvars > 1.2 and framewise_displacement > 0.5 will be scrubbed. Any number of variables found under the confounds dictionary can be used for scrubbing, with any cutoff point. In the second example, uniform scrubbing is specified: 0.8 means the best 80% of timepoints will be kept, and the sub-list says that std_dvars and framewise_displacement should be used to determine which volumes are best. The two metrics are demeaned and variance normalized, then added together, and the 80% of volumes with the lowest combined score are kept. Any variables under the confounds dictionary (other than groupings such as motion_regs_24) can be used to construct these dictionaries; see the sketch below.
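    As a rough illustration of the uniform criterion (this sketch is not the package's exact implementation; the helper name is hypothetical):

    .. code-block:: python

        import numpy as np

        def uniform_scrub_mask(confounds, keep_ratio, metric_names):
            #Illustrative only: demean/variance-normalize each metric,
            #sum them, and keep the keep_ratio fraction of timepoints
            #with the lowest combined score
            score = np.zeros(np.asarray(confounds[metric_names[0]]).size)
            for name in metric_names:
                metric = np.asarray(confounds[name], dtype=float)
                score += (metric - np.nanmean(metric)) / np.nanstd(metric)
            n_keep = int(keep_ratio * score.size)
            mask = np.zeros(score.size, dtype=bool)
            mask[np.argsort(score)[:n_keep]] = True
            return mask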
interpolation_method : str
    Can choose between 'linear', 'cubic_spline', and 'spectral'. Spectral interpolation takes the longest but is expected to perform the best (it is based on the technique presented in Power's 2014 NeuroImage paper and Anish Mitra's work).
noise_comps_dict : dict or bool
    This dictionary configures which nuisance signals will be removed from the parcellated timeseries. Each key is the name of a confound (or group of confounds) from the confounds dictionary to be regressed, and each value is either False or an integer specifying whether the nuisance regressors should be reduced by PCA and, if so, how many principal components to keep. Some examples are seen below:
    .. code-block:: python

        #Include 24 motion parameters as regressors
        noise_comps_dict = {'motion_regs_twentyfour' : False}

        #Include 24 motion parameters as regressors, reduced through PCA to 10 regressors
        noise_comps_dict = {'motion_regs_twentyfour' : 10}

        #Include WM/CSF/GSR + motion parameters as regressors
        noise_comps_dict = {'wmcsfgsr' : False, 'motion_regs_twentyfour' : False}

        #Include WM/CSF/GSR + ICA-AROMA noise timeseries as regressors
        noise_comps_dict = {'wmcsfgsr' : False, 'aroma_noise_ics' : False}

        #Skip nuisance regression
        noise_comps_dict = False
clean_comps_dict : dict or bool
    The formatting of this dictionary is identical to noise_comps_dict, but it specifies components whose variance you do not want removed from the parcellated timeseries. During denoising, a linear model is fit to the parcellated time-series using the signals from both noise_comps_dict and clean_comps_dict, but only the signal explained by noise_comps_dict is removed.
high_pass : float or bool
    The cutoff frequency for the high-pass filter used in denoising. If you want to skip the high-pass filter, set to False.
low_pass : float or bool
    The cutoff frequency for the low-pass filter used in denoising. If you want to skip the low-pass filter, set to False. A sketch of this final filtering step follows the parameter list.
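The filtering implementation itself is not shown in this diff; as a minimal sketch of what such a band-pass could look like (assuming a zero-phase Butterworth filter via scipy, which may differ from the actual implementation):

.. code-block:: python

    from scipy.signal import butter, filtfilt

    def bandpass_sketch(signal, TR, high_pass, low_pass, order=2):
        #Illustrative band-pass; signal has shape <n_regions, n_timepoints>
        nyquist = 0.5 / TR
        b, a = butter(order, [high_pass / nyquist, low_pass / nyquist],
                      btype='bandpass')
        return filtfilt(b, a, signal, axis=-1)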
Running the function will output a dictionary containing the cleaned parcellated signal along with the settings used for denoising, other QC variables, and variables copied from the input dictionary. This includes:
* cleaned_timeseries: The cleaned signal after denoising, with shape <n_regions, n_timepoints>. Any scrubbed timepoints or timepoints removed at the beginning of the scan will be set to NaN
* denoising_settings.json: The settings specified when using the *denoise* function
* dvars_pre_cleaning: DVARS calculated pre-cleaning on all input parcels (timepoints skipped at the beginning of the run, plus the next timepoint after the initial skipped timepoints, will have DVARS set to -0.001); a sketch of the DVARS computation follows this list
* dvars_post_cleaning: DVARS calculated post-cleaning on all input parcels (scrubbed timepoints, timepoints at the beginning of the run, and timepoints following scrubbed timepoints will have DVARS set to -0.001)
* dvars_stats.json: Different statistics about DVARS (removed timepoints are not included in any of these statistics):
- mean_dvars_pre_cleaning: temporal average dvars before cleaning
- mean_dvars_post_cleaning: temporal average dvars after cleaning
- dvars_remaining_ratio: mean_dvars_post_cleaning/mean_dvars_pre_cleaning
- max_dvars_pre_cleaning: highest dvars value before cleaning
- max_dvars_post_cleaning: highest dvars value after cleaning
* file_path_dictionary.json: copied from input, containing file paths involved in constructing the parcellated dictionary
* general_info.json: copied from input, containing relevant info such as the name of the subject/session, parcel labels, number of high motion and fd timepoints (calculated from fMRIPREP), etc.
* good_timepoint_inds: the indices for timepoints with defined signal (i.e. everything but the volumes dropped at the beginning of the scan and scrubbed timepoints)
* labels: another copy of the parcel label names
* mean_roi_signal_intensities.json: the mean signal intensities for raw fMRIPREP calculated csf, global_signal, and white_matter variables
* median_ts_intensities: The spatial mean of the temporal median of all voxels/vertices within each parcel (calculated on fMRIPREP output)
* num_good_timepoints: the total number of good timepoints left after scrubbing and removing initial volumes
* std_after_regression: The temporal standard deviation of each parcel's timesignal after nuisance regression (calculated prior to the final filtering of the signal)
* std_before_regression: The temporal standard deviation of each parcel's timesignal prior to nuisance regression (if hpf_before_regression is used, this is calculated after that filtering step)
* std_regression_statistics:
    - mean_remaining_std_ratio: the average of std_before_regression/std_after_regression across all parcels
    - least_remaining_std_ratio: the minimum of std_before_regression/std_after_regression across all parcels
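For reference, DVARS as used above is the spatial root-mean-square of the backward temporal difference; a minimal sketch of that computation (illustrative, not necessarily the package's exact code):

.. code-block:: python

    import numpy as np

    def dvars_sketch(timeseries):
        #timeseries has shape <n_regions, n_timepoints>; the first
        #timepoint has no predecessor, so it gets the -0.001 sentinel
        diff = np.diff(timeseries, axis=1)
        vals = np.sqrt(np.mean(np.square(diff), axis=0))
        return np.concatenate([[-0.001], vals])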
In totality, processing follows the sequence below:
1. Calculate DVARS on the input time-series.
2. If hpf_before_regression is used, filter the parcellated time-series and the signals specified by clean_comps_dict and noise_comps_dict.
3. Calculate the temporal standard deviation for each parcel (for std_before_regression).
4. Fit the signals generated from clean_comps_dict and noise_comps_dict to the parcellated timeseries (using only defined, non-scrubbed points) and remove the signal explained by noise_comps_dict.
5. Calculate the temporal standard deviation for each parcel (for std_after_regression).
6. Interpolate over any scrubbed timepoints.
7. Apply a highpass, lowpass, or bandpass filter if specified.
8. Set all undefined timepoints to NaN.
9. Calculate DVARS on the output time-series.
10. Calculate remaining meta-data.
A sketch of a typical call is shown below.
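As an illustration, a call might look as follows (the argument values here are hypothetical examples, not recommendations):

.. code-block:: python

    cleaned_dict = denoise(parc_dict,
                           hpf_before_regression=False,
                           scrub_criteria_dictionary={'Uniform' : [0.8, ['std_dvars', 'framewise_displacement']]},
                           interpolation_method='spectral',
                           noise_comps_dict={'wmcsfgsr' : False, 'aroma_noise_ics' : False},
                           clean_comps_dict=False,
                           high_pass=0.01,
                           low_pass=False)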
#Function inputs:
@@ -78,27 +175,7 @@ def denoise(parc_dict, hpf_before_regression, scrub_criteria_dictionary, interpolation_method, noise_comps_dict, clean_comps_dict, high_pass, low_pass):
#if the key represents a grouping of confounds, then you can use the value to specify the
#number of principal components to kept from a reduction of the grouping. If hpf_before_regression
#is used, the filtering will happen after the PCA.


#high_pass, low_pass: Filters to be applied as the last step in processing.
#set as False if you don't want to use them, otherwise set equal to the
#cutoff frequency

#If any of the input parameters are set to True, they will be treated as if they were
#set to False, since True is not a meaningful value for these parameters.
"""

n_skip_vols = parc_dict['general_info.json']['n_skip_vols']
TR = parc_dict['general_info.json']['TR']
@@ -315,6 +392,7 @@ def denoise(parc_dict, hpf_before_regression, scrub_criteria_dictionary, interpolation_method, noise_comps_dict, clean_comps_dict, high_pass, low_pass):


def interpolate(timepoint_defined, signal, interp_type, TR):
"""
#timepoint_defined should be an array of length t with True at timepoints
#that are defined and False at timepoints that are not defined. signal should also
#be an array of length t. Timepoints defined as False will be overwritten. This
@@ -329,9 +407,11 @@ def interpolate(timepoint_defined, signal, interp_type, TR):
#(2) cubic_spline - takes 5 closest time points before/after undefined timepoints
#and applies cubic spline to undefined points. Uses defined signal to determine maximum/minimum
#bounds for new interpolated points.
#(3) spectral - yet to be implemented, will be based off of code from the 2014 Power
#(3) spectral - based on code from the 2014 Power
# paper
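#A rough sketch of the linear case (illustrative only, not the exact
#implementation):

.. code-block:: python

    import numpy as np

    def linear_interp_sketch(timepoint_defined, signal):
        #fill undefined timepoints from the surrounding defined ones
        t = np.arange(len(signal))
        good = np.asarray(timepoint_defined, dtype=bool)
        out = np.asarray(signal, dtype=float).copy()
        out[~good] = np.interp(t[~good], t[good], out[good])
        return out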
"""

timepoint_defined = np.array(timepoint_defined)

true_inds = np.where(timepoint_defined == True)[0]
@@ -435,7 +515,7 @@ def interpolate(timepoint_defined, signal, interp_type, TR):


def load_comps_dict(parc_dict, comps_dict):

"""
#Internal function, which is given a "parc_dict",
#with different useful resting-state properties
#(made by module parc_ts_dictionary), and accesses
@@ -462,6 +542,7 @@ def load_comps_dict(parc_dict, comps_dict):
#PCA is taken while ignoring the n_skip_vols
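#A rough sketch of this behavior (illustrative; the 'confounds' key and
#the exact PCA handling are assumptions):

.. code-block:: python

    import numpy as np

    def load_comps_dict_sketch(parc_dict, comps_dict, n_skip_vols):
        if comps_dict == False:
            return False
        signals = []
        for name, num_comps in comps_dict.items():
            comp = np.atleast_2d(parc_dict['confounds'][name])
            if num_comps != False:
                #reduce the grouping to its top principal components
                comp = reduce_ics(comp, num_comps, n_skip_vols)
            signals.append(comp)
        return np.vstack(signals)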
"""

if comps_dict == False:
return False
@@ -497,11 +578,12 @@


def reduce_ics(input_matrix, num_dimensions, n_skip_vols):

"""
#Takes input_matrix <num_original_dimensions, num_timepoints>. Returns
#the num_dimensions top PCs from the input_matrix which are derived excluding
#n_skip_vols, but zeros are padded to the beginning of the time series
#in place of the n_skip_vols.
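#A rough sketch of this reduction (illustrative only), assuming
#scikit-learn:

.. code-block:: python

    import numpy as np
    from sklearn.decomposition import PCA

    def reduce_ics_sketch(input_matrix, num_dimensions, n_skip_vols):
        #fit PCA on the timepoints after the skipped volumes
        pca = PCA(n_components=num_dimensions)
        comps = pca.fit_transform(input_matrix[:, n_skip_vols:].T).T
        #pad zeros in place of the skipped volumes
        return np.hstack([np.zeros((num_dimensions, n_skip_vols)), comps])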
"""


if input_matrix.shape[0] > input_matrix.shape[1]:
@@ -524,9 +606,10 @@ def reduce_ics(input_matrix, num_dimensions, n_skip_vols):
return pca_time_signal

def demean_normalize(one_d_array):

"""
#Takes a 1d array, subtracts the mean, and
#divides by the standard deviation
"""

temp_arr = one_d_array - np.nanmean(one_d_array)

@@ -535,6 +618,7 @@ def demean_normalize(one_d_array):

def find_timepoints_to_scrub(parc_object, scrubbing_dictionary):

"""
#This function is an internal function for the main denoising script.
#The purpose of this function is to return an array valued True for
#volumes to be included in subsequent analyses and False for volumes
@@ -545,6 +629,7 @@ def find_timepoints_to_scrub(parc_object, scrubbing_dictionary):
#If you don't want to scrub, just set scrubbing_dictionary equal to False, and
#this script will only get rid of the initial volumes
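#A rough sketch of that False branch (illustrative only):

.. code-block:: python

    import numpy as np

    def keep_all_but_initial(n_timepoints, n_skip_vols):
        #True everywhere except the initial skipped volumes
        mask = np.ones(n_timepoints, dtype=bool)
        mask[:n_skip_vols] = False
        return mask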
"""


if type(scrubbing_dictionary) == type(False):
