SpectraFlares_Doc.rtf

{\rtf1\ansi\ansicpg1252\deff0\nouicompat\deflang1033{\fonttbl{\f0\fnil\fcharset0 Calibri;}{\f1\fnil\fcharset0 Consolas;}{\f2\fnil Consolas;}}
{\colortbl ;\red0\green77\blue187;\red204\green120\blue50;\red169\green183\blue198;\red98\green151\blue85;\red155\green187\blue89;\red165\green194\blue97;\red0\green0\blue255;\red128\green128\blue128;}
{\*\generator Riched20 10.0.17134}\viewkind4\uc1 
\pard\box\brdrdash\brdrw0 \sa200\sl276\slmult1\tx8640\f0\fs24 Functionalities V 1.0\fs22\par

\pard\box\brdrdash\brdrw0 \sa200\sl276\slmult1 SpectraFlares/RadioBurst (currently known as "Flare_Downloader" in the new radiospectra package)\par
The aim of the package is to set up a project environment for downloading and reading solar flares captured by the e-Callisto instruments, starting from a main list of detected flares (Monz 2010).\par
The script is built on top of the radiospectra package.\par
Its functionality can be divided into two main parts:\par
- If the flares database has already been downloaded, the script can be used to browse the examples and to extract individual ones at will. The "Model_Utils" methods can also be used to normalize the data and prepare it for machine learning purposes.\par
- If the database has not been downloaded yet, the script itself can be used to build one.\par
In both cases a "base" list is needed with basic information such as the instrument, date, frequencies and time steps of the flares (the format of such a list is specified <here>):\par
\cf1 date    obse ving  state   start      end       class  sub           qual.  lower   upper      remarks\par
\cf0 100102  0801 1458   BLEN   0856.3    0858.2    III      G             1      180     416      (relative path to \tab\tab\tab\tab\tab\tab\tab\tab\tab the file if already downloaded)\par
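As a minimal sketch of how one row of the base list can be read, the snippet below splits the example row above into named fields using only the standard library. The helper name parse_burst_row and the shortened column name "qual" are assumptions made for illustration and are not part of the package. The sketch also assumes that every column before "remarks" is filled in, which may not hold for every row of a real list.

```python
# Column names copied from the header row above; "remarks" is optional.
COLUMNS = ["date", "obse", "ving", "state", "start", "end",
           "class", "sub", "qual", "lower", "upper"]


def parse_burst_row(line):
    """Split one whitespace-separated row of the base list into a dict.

    Hypothetical helper for illustration; it is not part of the package.
    """
    fields = line.split(None, len(COLUMNS))
    if len(fields) < len(COLUMNS):
        raise ValueError("expected at least %d fields" % len(COLUMNS))
    row = dict(zip(COLUMNS, fields))
    # Anything after the 11 fixed columns is treated as the remarks text.
    if len(fields) > len(COLUMNS):
        row["remarks"] = fields[len(COLUMNS)].strip()
    else:
        row["remarks"] = ""
    return row


example = "100102  0801 1458   BLEN   0856.3    0858.2    III      G             1      180     416"
row = parse_burst_row(example)
print(row["date"], row["state"], row["start"], row["end"], row["lower"], row["upper"])
```

In the package itself the list is handled as a pandas dataframe; the dict above only illustrates which value ends up under which column name.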
\par
\f1\fs28 Code overview\par
\cf2\f2\fs20\lang9\par
\cf0\f1\lang1033 ## Main method to download flares, using a dataframe formatted like 2010Test.txt (which can be found in the examples section of the package)\par
\cf1\f2\lang9 def e_Callisto_burst_downloader(data, sort=False, folder="e-Callisto_Flares", exist=False):\cf3\par
\cf4\i """\line     Download a set of burst\f1\lang1033 s\f2\lang9  based on a dataframe\f1\lang1033  in the Callisto format, adjusting the format of the dates and frequencies. The data is downloaded as files of 15 minutes each, so a flare lasting 1 hour yields approximately 4 files. To get a single file per flare, follow this call with \cf5\i0\f2 e_Callisto_Burst_simplifier\f1 .\cf4\i\f2\lang9\line\line     Args:\line         data: pandas dataframe\line         sort: Python boolean. If 'True', it creates a subset of folders for the flare subtypes\line         folder: name or path of the folder where the flares will be downloaded\line         exist: Python boolean. If 'True', an already existing folder is reused instead of raising an error\line     Returns:\line         rclean_test: Pandas dataframe. Contains the information of all the already downloaded flares,\line             as well as the paths of their respective FITS files.\line         exceptions_frame: Pandas dataframe. Contains information about files that could not be downloaded\line     """\cf3\i0\line     \cf4\i\line\cf0     \i0 start = time.time()\line     data = preprocessing_txt(data)\line     data = microseconds_clean(data)\line     os.makedirs('./\{\}'.format(folder), exist_ok=exist)\line     clean_directions = pd.DataFrame(columns=data.columns)\line     exceptions_frame = pd.DataFrame(columns=data.columns)\line     for index, row in data.iterrows():\line         clean_directions, exceptions_frame = e_Callisto_exceptionSeeker(index, data, clean_directions, exceptions_frame,\line                                                                         folder, sort)\line     rclean_test, rexcept_test = iter_remarks_Cleaners(clean_directions)\line     exceptions_frame = exceptions_frame.append(rexcept_test)\line     end = time.time()\line     print("Download completed in ----- " + str(end - start) + " secs")\line     return rclean_test, exceptions_frame\par
\cf1 def preprocessing_txt(data):\line\cf3     \cf4\i """\line     Preprocesses a dataframe in the Callisto flare notation\f1\lang1033  by zero-padding the date and time fields, stripping trailing 'x' markers from the frequency bounds and converting the remarks to strings\f2\lang9\line\line     Args:\line       data: Pandas dataframe\line     Returns:\line       Preprocessed dataframe\line     """\line\cf0     \i0 data['date'] = data['date'].apply(lambda x: '\{0:0>8\}'.format(x))\line     data['end'] = data['end'].apply(lambda x: '\{0:0>6\}'.format(x))\line     data['start'] = data['start'].apply(lambda x: '\{0:0>6\}'.format(x))\line     data['lower'] = data['lower'].astype(str).map(lambda x: x.rstrip('xX'))\line     data['upper'] = data['upper'].astype(str).map(lambda x: x.rstrip('xX'))\line     # Sanity preserver\line     # data['lower'] = data['lower'].astype(str).str.replace('\\D-', '')\line     # data['upper'] = data['upper'].astype(str).str.replace('\\D+', '')\line     data['remarks'] = data['remarks'].astype(str)\line     return data\par
\cf1 def microseconds_clean(data):\cf3\line     \cf4\i """\line     \f1\lang1033 Converts the fractional minutes of the stored start and end times into seconds, in order not to lose flares that are only seconds or minutes long\f2\lang9\line\line     Args:\line       data: Pandas dataframe\line     Returns:\line       Preprocessed dataframe\line     """\line\cf0     \i0 for index, elemen in data.iterrows():\line         if data.loc[index]['start'] == '000nan':\line             continue\line\line         string = data.loc[index]['start']\line         # Zero-pad the seconds so that e.g. '.1' becomes '06' rather than '6'\line         new_start = string[0:4] + str(int(float(string[4:6]) * 60)).zfill(2)\line         data.at[index, 'start'] = new_start\line\line         string = data.loc[index]['end']\line         new_end = string[0:4] + str(int(float(string[4:6]) * 60)).zfill(2)\line         data.at[index, 'end'] = new_end\line\line     return data\par
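The conversion performed by microseconds_clean can be illustrated in isolation. The sketch below assumes times written as HHMM.m (hours, minutes and one decimal digit of minutes), as in the example list. The helper name fraction_to_hhmmss is hypothetical, and the snippet is an illustration rather than the package's own code.

```python
def fraction_to_hhmmss(value):
    """Turn 'HHMM.m' (fractional minutes) into 'HH:MM:SS'.

    Illustrative helper only; assumes exactly one decimal digit of minutes.
    """
    text = str(value).rjust(6, "0")          # same zero padding as preprocessing_txt
    hours, minutes = text[0:2], text[2:4]
    seconds = int(round(float(text[4:6]) * 60))   # '.3' of a minute is 18 seconds
    return "%s:%s:%02d" % (hours, minutes, seconds)


print(fraction_to_hhmmss("0856.3"))   # 08:56:18
print(fraction_to_hhmmss("0858.1"))   # 08:58:06
```

Zero-padding the seconds keeps every converted time six digits long, so the later step that inserts the colons always sees pairs of digits.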
\f1\lang1033\par
\cf1\f2 def e_Callisto_exceptionSeeker(row_num, dataframe, new_frame, exceptions_fr, folder, sort=False):\cf3\line     \cf4\i """\line     \f1 Tries to download the flare stored in one row of the dataframe. On success, a local folder is created (if needed) to store the files, and the relative paths to them are saved in the "remarks" column of "new_frame". If an error occurs while downloading, the row describing the flare is stored in "exceptions_fr".\f2\line\line     Args:\line         row_num: index of the row in the dataframe\line         dataframe: pandas dataframe with the preprocessed flare list\line         new_frame: pandas dataframe collecting the rows downloaded so far\line         exceptions_fr: pandas dataframe collecting the rows that could not be downloaded\line         folder: name or path of the folder where the flares will be downloaded\line         sort: Python boolean. If 'True', it creates a subset of folders for the flare subtypes\line     Returns:\line         new_frame: Pandas dataframe. Contains the information of all the already downloaded flares,\line             as well as the paths of their respective FITS files.\line         exceptions_fr: Pandas dataframe. Contains information about files that could not be downloaded\line\line     """\line\cf0     \i0 try:\line\line         instrument, year, start, end = range_Generator(row_num, dataframe)\line         # creator_instrument returns None when no frequency range matches\line         if instrument is None:\line             raise Exception\line\line         start = parse_time(year + ' ' + start)\line         end = parse_time(year + ' ' + end)\line         urls = query(start, end, [instrument])\line\line         row = dataframe.loc[row_num]\line         flareType = row['class']\line         subtype = row['sub']\line\line         if sort:\line             directory = directorySubtypeGenerator(folder, flareType, subtype)\line         else:\line             directory = directoryFlaretype(folder, flareType)\line\line         dirlist = ''\line         for url in urls:\line             dire = download_file(url, directory)\line             dirlist = dirlist + relpath(dire) + ','\line\line         new_frame = new_frame.append(dataframe.loc[row_num])\line         new_frame.at[row_num, 'remarks'] = dirlist\line         return new_frame, exceptions_fr\line     except Exception:\line         exceptions_fr = exceptions_fr.append(dataframe.loc[row_num])\line         return new_frame, exceptions_fr\par
\par
\cf1\f1\lang9 def range_Generator(row_num, dataframe):\cf3\line     \cf4\i """\line     Generates the strings required to work with the CallistoSpectrogram class\lang1033  by putting together all the "creator_" methods\f2\lang9\line\line     Args:\line         row_num: index of the row\line         dataframe:  pandas dataframe\line     Returns:\line         Instrument name, date, start time and end time, formatted for use with CallistoSpectrogram\line     """\line     \cf0\i0 row = dataframe.loc[row_num]\line     instrument = creator_instrument(row['lower'], row['upper'])\line     year = creator_date(row['date'])\line     start = creator_time(row['start'])\line     end = creator_time(row['end'])\line     return instrument, year, start, end\par
\par
\f0\fs24\lang1033 The "creator_" methods adjust the format of the time and frequency strings stored in a dataframe so that they can be used with the CallistoSpectrogram methods.\fs20\par
\f2 def creator_instrument(lower, upper):\cf3\line     \cf4\i """\line     Generates the approximate instrument string based on the frequencies, to use directly with CallistoSpectrogram\line\line     Args:\line         lower: lower frequency bound of the flare\line         upper: upper frequency bound of the flare\line\line     Returns:\line         Name of the instrument based on the frequencies analysed, or None if no range matches\line     """\line\line\cf0     \i0 lower = int(lower)\line     upper = int(upper)\line     if lower >= 1200 and upper <= 1800:\line         return "BLEN5M"\line     if lower >= 110 and upper <= 870:\line         return "BLEN7M"\line     # The upper bound is set to 110 in order to cover a wider range of flares\line     if lower >= 20 and upper <= 110:\line         return "BLENSW"\line     return None\line\cf6\line\cf0 def creator_date(date):\cf3\line     \cf4\i """\line     Creates the date format to use directly with CallistoSpectrogram\line\line     Args:\line         date: date from dataframe\line     Returns:\line         Modified date to use with standard time libraries\line     """\line\cf0     \i0 date = date_cleaner(date)\line     date = re.sub(r'((?:(?=(1|.))\\2)\{2\})(?!$)', r'\\1/', date)\line     if int(date[0] + date[1]) > 50:\line         date = '19' + date\line     else:\line         date = '20' + date\line     return date\cf3\line\line\cf0 def creator_time(time):\cf3\line     \cf4\i """\line     Creates the time format to use directly with CallistoSpectrogram\line\line     Args:\line         time: time from dataframe\line     Returns:\line         Modified time to use with standard time libraries\line     """\line\cf0     \i0 # Insert a colon after every pair of characters except the last one\line     return re.sub(r'((?:(?=(1|.))\\2)\{2\})(?!$)', r'\\1:', str(time))\par
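The effect of the date and instrument helpers can be sketched without regular expressions. The snippet below is an illustration under stated assumptions: date_cleaner is not shown in this document, so it is assumed here to remove the zero padding added by preprocessing_txt, and the names yymmdd_to_date and guess_instrument are hypothetical stand-ins rather than package functions.

```python
def yymmdd_to_date(date):
    """Format 'YYMMDD' as 'YYYY/MM/DD' with the same pivot as creator_date.

    Two-digit years above 50 are read as 19xx, all others as 20xx.
    """
    text = str(date)[-6:].rjust(6, "0")   # assumed equivalent of date_cleaner
    century = "19" if int(text[0:2]) > 50 else "20"
    return century + text[0:2] + "/" + text[2:4] + "/" + text[4:6]


def guess_instrument(lower, upper):
    """Frequency-range lookup mirroring creator_instrument.

    Returns None when the range fits none of the three instruments.
    """
    lower, upper = int(lower), int(upper)
    ranges = [(1200, 1800, "BLEN5M"), (110, 870, "BLEN7M"), (20, 110, "BLENSW")]
    for low, high, name in ranges:
        if lower >= low and upper <= high:
            return name
    return None


print(yymmdd_to_date("00100102"))       # 2010/01/02
print(guess_instrument("180", "416"))   # BLEN7M
```

A range that straddles two instruments, such as 100 to 500, matches none of the entries, which is why the downloader has to treat a missing instrument as a failed row.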
\par
\cf1 def directorySubtypeGenerator(folder, flareType, subtype):\cf3\line     \cf4\i """\line     Generates directories based on the type and subtype of the flares\line\line     Args:\line         folder: root directory\line         flareType: type of flare from dataframe\line         subtype: subtype of flare from dataframe\line     Returns:\line         path to the new flare directory\line     """\line\cf0     \i0 path = './\{\}/\{\}/\{\}'.format(folder, flareType, subtype)\line     if not os.path.isdir(path):\line         os.makedirs(path)\line     return os.path.realpath(path)\par
\cf1 def directoryFlaretype(folder, flareType):\cf3\line     \cf4\i """\line     Generates directories based ONLY on the type of the flares\line\line     Args:\line         folder: root directory\line         flareType: type of flare from dataframe\line     Returns:\line         path to the new flare directory\line     """\line\cf0     \i0 path = './\{\}/\{\}'.format(folder, flareType)\line     if not os.path.isdir(path):\line         os.makedirs(path)\line     return os.path.realpath(path)\par
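The two directory helpers differ only in whether the subtype is part of the path, so they could be folded into one function. The sketch below shows one possible way to do that with os.path.join and the exist_ok flag of os.makedirs. The name flare_directory is hypothetical, and the snippet is a suggestion rather than code from the package.

```python
import os
import tempfile


def flare_directory(folder, flare_type, subtype=None):
    """Create (if needed) and return the directory for a flare type.

    When a subtype is given, one more folder level is added for it.
    """
    parts = [folder, str(flare_type)]
    if subtype is not None:
        parts.append(str(subtype))
    path = os.path.join(*parts)
    os.makedirs(path, exist_ok=True)   # no error if the directory already exists
    return os.path.realpath(path)


root = tempfile.mkdtemp()              # throwaway root folder for the example
print(flare_directory(root, "III"))
print(flare_directory(root, "III", "G"))
```

Calling the function twice with the same arguments returns the same path, so it can be used inside the download loop without checking for the folder first.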
\par
\cf1 def remarks_Cleaners(row_num, dataframe, new_frame, exceptions_fr):\cf3\line     \cf4\i """\line     \f1 Sorts out the rows whose\f2  remarks column \f1 is blank in an already downloaded dataframe. This is mainly to avoid issues when working with the dataframe as a whole (e.g. when using the simplifier method).\f2\par
    Args:\par
        \f1 row_num\f2 : \f1 index of the row in the dataframe\f2\line         dataframe: pandas dataframe\par
        \f1 new_frame\f2 : pandas dataframe collecting the rows that have a path in "remarks"\line         \f1 exceptions_fr\f2 : pandas dataframe collecting the rows with an empty "remarks" entry\line\line     Returns:\line         \f1 new_frame\f2 : Pandas dataframe. Contains the information of all the already downloaded flares,\line             as well as the paths of their respective FITS files.\line         exceptions_fr: Pandas dataframe. Contains information about files that could not be downloaded\line\line     """\line\cf0     \i0 row = dataframe.loc[row_num]\line     directions = row['remarks']\line\line     if directions != '':\line         new_frame = new_frame.append(dataframe.loc[row_num])\line         return new_frame, exceptions_fr\line     else:\line         exceptions_fr = exceptions_fr.append(dataframe.loc[row_num])\line         return new_frame, exceptions_fr\par
\cf1 def iter_remarks_Cleaners(data):\line\cf3     \cf4\i """\par
    Iterates over a dataframe using remarks_Cleaners.\par
    Args:\line         data: pandas dataframe\line     Returns:\line         \f1 clean_directions\f2 : Pandas dataframe. Contains the information of all the already downloaded flares\f1 .\f2\line         exceptions_frame: Pandas dataframe. Contains information about files that could not be downloaded\line\par
"""\line\cf0     \i0 clean_directions = pd.DataFrame(columns = data.columns)\line     exceptions_frame = pd.DataFrame(columns = data.columns)\line     for index, row in data.iterrows():\line         clean_directions, exceptions_frame = remarks_Cleaners(index, data, clean_directions, exceptions_frame)\line     return clean_directions, exceptions_frame\par
\f0\par
\fs24 ## The next method is used to join several files of flares into one (one file per flare)\par
\cf1\f2\fs20 def e_Callisto_Burst_simplifier(dataframe, folder, sort=False):\cf3\line     \cf4\i """\line     Joins the files of each flare along the time axis\f1  by iterating over an already downloaded dataframe, producing one file per flare.\par
\f2\line     Args:\line         dataframe: pandas dataframe.\line         folder: name or path of the folder where the joined files will be saved.\line         sort: Python boolean. If 'True', it creates a subset of folders for the flare subtypes.\line     Returns:\line         joined: Pandas dataframe. Rows of the joined flares, with the path of each joined FITS file in "remarks".\line         special: Pandas dataframe. Rows of the flares whose files are damaged.\line     """\line\cf0     \i0 start = time.time()\line     os.makedirs('./\{\}'.format(folder))\line     joined = pd.DataFrame(columns=dataframe.columns)\line     special = pd.DataFrame(columns=dataframe.columns)\line\line     for index, elem in dataframe.iterrows():\line\line         directions = dir_Gen(index, dataframe)\line         name = os.path.basename(directions[0])\line         try:\line             bursts_here = CallistoSpectrogram.from_files(directions)\line         except ValueError:\line             print("Damaged file found at \\n" + str(dataframe.loc[index]))\line             special = special.append(dataframe.loc[index])\line             continue\line         row = dataframe.loc[index]\line\line         flareType = row['class']\line         if sort:\line             subtype = row['sub']\line             directory = directorySubtypeGenerator(folder, flareType, subtype)\line         else:\line             directory = directoryFlaretype(folder, flareType)\line\line         # os.path.join picks the right separator for the current platform\line         path = os.path.join(directory, name)\line         CallistoSpectrogram.join_many(bursts_here).save(relpath(path))\line         joined = joined.append(dataframe.loc[index])\line         joined.at[index, 'remarks'] = relpath(path)\line     end = time.time()\line     print("\\nJoined after----- " + str(end - start) + " secs\\n")\line     return joined, special\par
\cf1 def dir_Gen(row_num, dataframe):\line\cf3     \cf4\i """\line     Gets the directory of the data from the remarks column\par
    Args:\line         \f1 row_num\f2 : \f1 Index in dataframe\f2\line         \f1 dataframe\f2 :\f1  \f2 pandas dataframe.\line     Returns:\line         \f1 directionsList: List containing path to files\f2\line\line     """\line\cf0\line     \i0 row = dataframe.loc[row_num]\line     directions = row['remarks']\line\line     directionsList = [x.strip() for x in directions.split(',')[:-1]]\line\line     return directionsList\par
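dir_Gen relies on the trailing comma that the downloader writes after every path: it splits on commas and drops the last, empty element. A slightly more tolerant variant, sketched below, filters out empty entries instead, so that a "remarks" value holding a single path without a trailing comma still yields that path. The name split_remarks and the file names are made up for illustration.

```python
def split_remarks(remarks):
    """Split a comma-separated list of paths, ignoring empty entries.

    Illustrative helper only; it is not part of the package.
    """
    return [part.strip() for part in str(remarks).split(",") if part.strip()]


# Comma-terminated list, as written by the downloader
print(split_remarks("Flares/III/a.fit.gz, Flares/III/b.fit.gz,"))
# Single path without trailing comma, as written by the simplifier
print(split_remarks("Joined/III/a.fit.gz"))
```

With the slicing used in dir_Gen, the second call would return an empty list, because the only element is discarded as if it were the remainder after a trailing comma.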
\f0\fs24\par
## The following methods are used to have a preview of the dataset\f2\fs20\par
\cf8\f1\lang9 # Peek a flare from the Callisto \lang1033 online\f2\lang9  database\line\cf1 def Callisto_flare(row_num, dataframe, show_url=False):\line\cf3     \cf4\i """\line     Peek a flare from a row of a given dataframe\f1\lang1033 , fetching it from the online database. This can be used to test and compare files. Beware that, since CallistoSpectrogram.from_range is used here, the results may look slightly different from the downloaded files.\f2\lang9\par
    Args:\line         \f1\lang1033 row_num\f2\lang9 : index label of the row in the dataframe\line         dataframe: pandas dataframe\line         \f1\lang1033 show_url\f2\lang9 : \f1\lang1033 Boolean. If true, the URLs of the files are printed\f2\lang9\line\line     Returns:\line         CallistoSpectrogram object\line\line     """\line\line\cf0     \i0 row = dataframe.loc[row_num]\line     instrument, year, start, end = range_Generator(row_num, dataframe)\line     print(instrument)\line     print('  ' + row['lower'], row['upper'])\line     print(creator_date(row['date']))\line     print(start)\line     print(end)\line     if show_url:\line         startQ = parse_time(year + ' ' + start)\line         endQ = parse_time(year + ' ' + end)\line         urls = query(startQ, endQ, [instrument])\line         for url in urls:\line             print(url)\line\line     Spectra = CallistoSpectrogram.from_range(instrument, year + ' ' + start, year + ' ' + end)\line     Spectra.peek()\line     return Spectra\line\cf3\line\cf1 def Callisto_simple_flare(index, dataframe):\line\cf3     \cf4\i """\line     Peeks a spectrogram\f1\lang1033  from an already downloaded dataset, using the index with "loc"\f2\lang9\line\line     Args:\line         index: index label of the row in the dataframe\line         dataframe: pandas dataframe\line     Returns:\line         CallistoSpectrogram object\line     """\line\cf0     \i0 Spectra = CallistoSpectrogram.read(dataframe.loc[index]['remarks'])\line     Spectra.peek()\line     return Spectra\line\cf3\line\cf1 def Callisto_simple_iflare(index, dataframe):\cf3\line     \cf4\i """\line     Peeks a spectrogram\f1\lang1033  from an already downloaded dataset, using the index with "iloc"\f2\lang9\line\line     Args:\line         index: integer position of the row in the dataframe\line         dataframe: pandas dataframe\line     Returns:\line         CallistoSpectrogram object\line     """\line\cf0     \i0 Spectra = 
CallistoSpectrogram.read(dataframe.iloc[index]['remarks'])\line     Spectra.peek()\line     return Spectra\line\cf3\line\cf1 def preview(dataframe, show_details = True):\cf3\line     \cf4\i """\line     Shows a preview of \f1\lang1033 the whole dataframe. To limit the number of elements shown, slice the dataframe first (like df[0:10]).\f2\lang9\line       Args:\line           dataframe: pandas dataframe\line           show_details: Boolean. If true, information about each flare is printed\line       Returns:\line           None. Each spectrogram is displayed with peek()\line       """\line\cf0     \i0 for index, elem in dataframe.iterrows():\line         row = dataframe.loc[index]\line         instrument, year, start, end = range_Generator(index, dataframe)\line         if show_details:\line             print("Type "+str(row['class']))\line             print('  Range ' + row['lower'], row['upper'])\line             print(start)\line             print(end)\line             print(creator_date(row['date']))\line         Callisto_simple_flare(index, dataframe)\line\cf3\par
\cf0\f1\par

\pard\sa200\sl276\slmult1\f0\fs22\par
\par
}
 