BGCA is a tool for the automated analysis of bacterial growth curves from 96-well plates. The user enters the plate layout through the GUI, and the supplied parameters are then used to calculate curve-specific parameters such as maximum yield, maximum slope, area under the curve (AUC) and length of lag phase. If desired, ecotoxicological measures such as LOEC/NOEC and MIC can be calculated, based either on statistical analysis of selected curve parameters or on user-defined cutoffs compared to control samples. The results can be explored through an interactive plotting GUI and exported to Excel for further analysis. BGCA is designed to take diverse plate layouts into account, and has options for dealing with sample replicates, positive controls and background samples.
The BGCA interface can be run either from the command line, using `python /path/to/main.py` (replacing `/path/to` with the local path to `main.py`), or on Windows by running the provided .exe file (if available). This requires Python >=3.10 but <3.12.
NOTE: BGCA is currently undergoing active development, so crashes and bugs, as well as minor changes in functionality, might still occur.
The variety of experimental setups for growth experiments in a 96 well plate is vast. The setups extensively tested with BGCA are shown below. While analysing different setups using BGCA is certainly possible, there may be bugs that have not been found yet, which may cause BGCA to crash.
In a dose-response experimental setting, concentration gradients should go from high concentrations (left on plate) to low concentrations (right on plate). Positive controls (in this context, wells containing only bacteria and no drug/chemical) should be placed at the right end of the plate. Replicates should be organized row-wise (R1 and R2 in the figure). If background samples are present, they should be organized row-wise, in the same manner as bacterial samples.
Characterization experiments can be set up either row-wise, as shown above (without positive and background samples), or column-wise, as shown below. In the column-wise case, three horizontally adjacent wells provide replicates for a single bacterial sample. When using this plate layout, replicates should be provided to the BGCA interface as A01:A02:A03, A04:A05:A06, ... and so on (see section 'usage').
BGCA was originally designed to take Omnilog time series data exported to Excel as input, but it can be used to analyse any time series data if the format is adjusted to mirror the Omnilog output format. The currently supported input format is as follows: an Excel table with a maximum of 97 columns, the first column being named 'Hour' and the following columns being named by a combination of the letters A-H and the numbers 1-12 (8 rows on a 96-well plate, symbolized by the letters, with 12 columns per row). An example input is shown in the image below, and can also be found in the provided example file (https://github.com/EbmeyerSt/bgca/blob/main/example.xlsx).
![example_input](https://private-user-images.githubusercontent.com/11669686/296772223-43803b79-6adc-45ac-ba8a-2c29a5926056.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk0MTc5OTMsIm5iZiI6MTczOTQxNzY5MywicGF0aCI6Ii8xMTY2OTY4Ni8yOTY3NzIyMjMtNDM4MDNiNzktNmFkYy00NWFjLWJhOGEtMmMyOWE1OTI2MDU2LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTMlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjEzVDAzMzQ1M1omWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWQ5NmQ0YTZmZDRhNGQxNTU2YWE5NjRhYmRkZWZhYmY1NmNjOGJiNGM4MTA2YmNjOTgwN2I3MmQwNzRjYmFiMDcmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.DrqcJpiPGrKIjGYQ0NtHlwrqS2u2eNqdcOA5FIvAPwQ)
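As a rough illustration of the input format described above, the following sketch checks a list of column names before a file is handed to BGCA. The function name `check_header` is hypothetical and not part of BGCA, and accepting both `A1` and `A01` is an assumption, since the exact well naming should be taken from the example file.

```python
import re

# Accepts both "A1" and "A01". Whether BGCA requires zero-padding is not
# stated here, so this check is deliberately lenient (assumption).
WELL_PATTERN = re.compile(r"^[A-H](0?[1-9]|1[0-2])$")


def check_header(columns):
    """Return a list of problems found in the column names (empty if none)."""
    problems = []
    if not columns or columns[0] != "Hour":
        problems.append("first column must be named 'Hour'")
    if len(columns) > 97:
        problems.append(f"expected at most 97 columns, got {len(columns)}")
    for name in columns[1:]:
        if not WELL_PATTERN.match(str(name)):
            problems.append(f"unexpected column name: {name!r}")
    return problems


# Build a full 96-well header (Hour + A01..H12) and check it.
header = ["Hour"] + [f"{row}{col:02d}" for row in "ABCDEFGH" for col in range(1, 13)]
print(check_header(header))                   # no problems expected
print(check_header(["Time", "A01", "I13"]))   # two problems expected
```

The column names of a real file could be read with any spreadsheet library and passed to this function as a plain list of strings.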
BGCA has a multitude of options to specify experimental setups. You can specify which rows or columns on the plate are replicates of one another, which ones are background samples for others, and whether positive controls (in this context, wells inoculated only with bacteria and no growth-modifying agent) are present. These setups are specified in the upper part of the BGCA main window, as shown below.
Specific input formats for each of the fields in the BGCA main window are specified below. Note that a help text will appear for each field when hovering the mouse over the respective field's title.
Replicates: Replicates should be provided either row-wise or column-wise. If the plate contains no replicates, this field can be left blank. If row-wise: 'A:B, C:D, E:F, G:H' means that rows A and B are replicates, rows C and D are replicates, and so forth. Similarly, 'A:B:C:D, E:F:G:H' would imply that rows A, B, C and D are replicates of one another and E, F, G and H are replicates of one another. If column-wise: 'A01:A02:A03, A04:A05:A06, A07:A08:A09, ...' indicates that columns 1-3 in row A are replicates of one another, and so on.
Background rows: Backgrounds can to date only be provided row-wise. 'AB:CD, EF:GH' indicates that rows C and D provide the background for rows A and B, rows G and H provide the background for rows E and F. If no background is included in the plate setup, this field should be left blank.
Plate columns used: Drop-down list that can be used to specify how many of the 12 plate columns are used in the plate layout.
Average replicates: Check to average replicate rows or columns for calculating curve parameters.
Positive controls: Positive controls can be provided in the following format: 'A12:A, B12:B, ...' means that wells A12 and B12 provide the positive controls for rows A and B. If several positive controls per row are present, specify them as e.g. 'A11+A12:A, B11+B12:B, ...', meaning that wells A11 and A12 provide the positive controls for row A, wells B11 and B12 provide the positive controls for row B, and so on.
Smoothen curves: Fit a generalized additive model to each curve, smoothening the curve and removing noise. The smoothened curves are then used for calculating the curve parameters. Note that these fitted curves currently are monotonic, meaning they will not model a decrease in Omnilog Units after previous increases.
Concentrations: Can be provided either as a list of concentrations (e.g. 1, 0.75, 0.5, 0.25, ...) or as a dilution series in the form 'highest_concentration:dilution factor' (e.g. 12:4).
Unit: String that specifies the unit for the Concentrations field, e.g. ug/ml, mg/l, etc.
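To make the field formats above concrete, here is a minimal sketch of how such strings could be interpreted. The function names are hypothetical and do not come from BGCA's code. The dilution series interpretation (each step divides the previous concentration by the dilution factor) and the `n_wells` argument are assumptions.

```python
def parse_replicates(text):
    """'A:B, C:D' -> [['A', 'B'], ['C', 'D']]; blank input -> []."""
    return [group.strip().split(":") for group in text.split(",") if group.strip()]


def parse_backgrounds(text):
    """'AB:CD' -> [(['A', 'B'], ['C', 'D'])]: rows C and D are background for A and B."""
    pairs = []
    for item in text.split(","):
        if item.strip():
            samples, backgrounds = item.strip().split(":")
            pairs.append((list(samples), list(backgrounds)))
    return pairs


def parse_positive_controls(text):
    """'A11+A12:A, B12:B' -> {'A': ['A11', 'A12'], 'B': ['B12']}."""
    controls = {}
    for item in text.split(","):
        if item.strip():
            wells, row = item.strip().split(":")
            controls[row] = wells.split("+")
    return controls


def expand_concentrations(text, n_wells):
    """Return a list of concentrations from either supported notation.

    Assumption: in 'highest:factor' notation, every step divides the
    previous concentration by the factor, for n_wells gradient wells.
    """
    if ":" in text:
        highest, factor = (float(part) for part in text.split(":"))
        return [highest / factor**step for step in range(n_wells)]
    return [float(part) for part in text.split(",") if part.strip()]


print(parse_replicates("A:B, C:D, E:F, G:H"))
print(parse_backgrounds("AB:CD, EF:GH"))
print(parse_positive_controls("A11+A12:A, B11+B12:B"))
print(expand_concentrations("12:4", n_wells=4))
```

Under the stated assumption, `12:4` over four wells corresponds to 12, 3, 0.75 and 0.1875 in the chosen unit.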
It is possible to select methods for the calculation of length of lag phase, LOEC/NOEC and MIC in the lower part of BGCA's main window:
Lag-time calculation: Decide how the end of the lag phase should be calculated. Selecting 'OD value' and providing an integer threshold value in the 'OD value' field to the right will calculate the exact time point at which the Omnilog Units on the y-axis of the curve pass that value. Selecting '% max. OD' and providing an integer threshold value will calculate the exact time point at which the Omnilog Units on the y-axis pass the supplied percentage of the maximum OD.
LOEC calculation: Drop-down list with several options for calculating LOEC/NOEC values. The values are calculated by comparing either the calculated lag-time, the AUC, the slope or the yield. The selected parameter can be compared to a user-supplied cutoff value, which is a percentage of the positive control for the respective row: the lowest concentration at which the provided threshold is passed is assigned the LOEC, and the next lower concentration is assigned the NOEC. Alternatively, ANOVA followed by Dunnett's test is performed, and the lowest concentration at which the mean of the selected parameter is significantly different (alpha<=0.05) from other curves is assigned the LOEC, with the next lower concentration assigned the NOEC. If no LOEC/NOEC should be calculated, select 'None' in the list.
MIC calculation: Select 'max. OD' (currently the only method available for MIC calculation) and provide an Omnilog Unit threshold value. The lowest concentration where the Omnilog Units never cross the specified threshold value is assigned as MIC. If no MIC should be calculated, select 'None'.
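The cutoff-based LOEC/NOEC rule and the 'max. OD' MIC rule described above can be sketched as follows. This is an illustration rather than BGCA's implementation: the function names are hypothetical, the data are made up, and it assumes that "passing" the cutoff means the parameter falls below the given percentage of the positive control (for lag time the direction would presumably be reversed).

```python
def loec_noec_by_cutoff(concentrations, values, control_value, cutoff_percent):
    """Return (LOEC, NOEC) from a percentage-of-control cutoff.

    Assumption: an effect means the parameter drops below the cutoff.
    """
    threshold = control_value * cutoff_percent / 100
    ordered = sorted(zip(concentrations, values))  # ascending concentration
    for index, (concentration, value) in enumerate(ordered):
        if value < threshold:
            noec = ordered[index - 1][0] if index > 0 else None
            return concentration, noec
    return None, None  # threshold never passed


def mic_by_max_od(concentrations, curves, threshold):
    """Return the lowest concentration whose curve never exceeds the threshold."""
    inhibited = [
        concentration
        for concentration, curve in zip(concentrations, curves)
        if max(curve) <= threshold
    ]
    return min(inhibited) if inhibited else None


# Made-up example: four concentrations, one short curve per concentration.
concentrations = [8, 4, 2, 1]
curves = [[5, 8, 10], [5, 30, 60], [5, 80, 150], [5, 100, 190]]
yields = [max(curve) for curve in curves]

print(loec_noec_by_cutoff(concentrations, yields, control_value=200, cutoff_percent=50))
print(mic_by_max_od(concentrations, curves, threshold=25))
```

With these example values the yield first drops below 50% of the control at concentration 4, so LOEC is 4 and NOEC is 2, while only the curve at concentration 8 stays below 25 units.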
Once all fields for the calculation of the curve parameters are specified, clicking 'submit' will calculate the curve parameters and allow the user to continue to the plotting window.
BGCA's plotting window allows the user to selectively and visually explore the growth data provided in the input file and assess the calculated parameters. The results can then be exported to Excel.
Rows to plot: Takes a list of letters as input. E.g. providing 'a, b' will plot data from the rows A and B. Specifying 'Columns to plot' as well is mandatory.
Columns to plot: Takes a list or a range of columns to plot. E.g. '1, 2, 3, 4' or '1-4' will plot columns one to four. Specifying 'Rows to plot' as well is mandatory.
Curve type: Drop-down list to select the curve type to plot. Raw plots the raw (input) data, Raw processed plots the averaged and/or background-subtracted data, Smoothened plots the smoothened curves if the respective box has been checked in the BGCA main window.
Clicking the Save button at the bottom of the window will export the data and calculated curve parameters to Excel. The corresponding output file has four sheets: raw_data (containing the raw data), calc_data (containing the averaged and background-subtracted data, if applicable), metrics (containing the calculated metrics, see figure below) and plot (containing a plot of all curves).
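The accepted notations for 'Rows to plot' and 'Columns to plot' can be illustrated with a small sketch. The function names are hypothetical and this is not BGCA's own parsing code.

```python
def parse_rows(text):
    """'a, b' -> ['A', 'B'] (row letters are treated case-insensitively)."""
    return [part.strip().upper() for part in text.split(",") if part.strip()]


def parse_columns(text):
    """'1, 2, 3, 4' or '1-4' -> [1, 2, 3, 4]."""
    columns = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A range such as '1-4' includes both end points.
            start, end = (int(value) for value in part.split("-"))
            columns.extend(range(start, end + 1))
        else:
            columns.append(int(part))
    return columns


print(parse_rows("a, b"))
print(parse_columns("1-4"))
print(parse_columns("1, 2, 3, 4"))
```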
![output_metrics_example](https://private-user-images.githubusercontent.com/11669686/296989041-8f7f8835-ca80-478a-9899-471a7830953f.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk0MTc5OTMsIm5iZiI6MTczOTQxNzY5MywicGF0aCI6Ii8xMTY2OTY4Ni8yOTY5ODkwNDEtOGY3Zjg4MzUtY2E4MC00NzhhLTk4OTktNDcxYTc4MzA5NTNmLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTMlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjEzVDAzMzQ1M1omWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWRmMTZmYTA2MWUyOTA4ODQyMmQ2MDdiZTdlZDRkMTVlYTRjNDczZDMwZTM5NGI4ZmE2NDViZWFhNzhkYTFlNmUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.rkHj2WLwZJMCc_XquMU7sl76renNBw8EoeDU1S-g0Ec)
This section provides details on how the output metrics are calculated by BGCA.
max_yield: The maximum Omnilog Unit value of the curve.
AUC: Scikit-learn's auc() function is used to calculate the AUC for each curve, using the trapezoidal rule.
lag_len: For % max. OD, the exact timepoint when the OD/Omnilog Units pass the specified cutoff is calculated. This is done by determining the first measured timepoint at which the threshold value has been passed, and the measured time point just before the threshold value is passed. The exact time at which OD/Omnilog Units > threshold is then determined by calculating a straight line between the points, according to y=mx+b, where m=(y2-y1)/(x2-x1), b=y1-m*x1 and x(threshold)=(y(threshold value)-b)/m
slope: Calculated by sliding a window of 4 consecutive values over the entire curve and finding the window with the greatest difference between its first and last value. The slope is then (y2-y1)/(x2-x1), where (x1, y1) and (x2, y2) are the first and last points of that window.
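The metric definitions above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not BGCA's code: the function names are hypothetical, the data are made up, and the AUC uses a hand-written trapezoidal rule as a dependency-free stand-in for scikit-learn's auc() function.

```python
def trapezoid_auc(times, values):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for i in range(1, len(times)):
        area += (times[i] - times[i - 1]) * (values[i] + values[i - 1]) / 2
    return area


def lag_time(times, values, threshold):
    """Interpolated time at which the curve first passes the threshold."""
    for i, value in enumerate(values):
        if value > threshold:
            if i == 0:
                return times[0]  # already above the threshold at the start
            x1, y1, x2, y2 = times[i - 1], values[i - 1], times[i], values[i]
            m = (y2 - y1) / (x2 - x1)   # slope of the connecting line
            b = y1 - m * x1             # intercept of the connecting line
            return (threshold - b) / m
    return None  # threshold never passed


def max_slope(times, values, window=4):
    """Slope over the window with the greatest first-to-last difference."""
    best = max(
        range(len(values) - window + 1),
        key=lambda i: values[i + window - 1] - values[i],
    )
    last = best + window - 1
    return (values[last] - values[best]) / (times[last] - times[best])


# Made-up curve: time in hours, signal in Omnilog Units.
hours = [0, 1, 2, 3, 4, 5, 6]
units = [10, 10, 20, 60, 120, 150, 155]

print("max_yield:", max(units))
print("AUC:", trapezoid_auc(hours, units))
print("lag_len (OD value 40):", lag_time(hours, units, 40))
print("lag_len (50 % max. OD):", lag_time(hours, units, 0.5 * max(units)))
print("slope:", max_slope(hours, units))
```

For this curve the threshold of 40 is passed between hours 2 and 3, and the interpolation places the crossing at 2.5 hours. The steepest 4-point window runs from hour 2 to hour 5.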