forked from kenfus/radiospectra
SpectraFlares_Doc.rtf
{\rtf1\ansi\ansicpg1252\deff0\nouicompat\deflang1033{\fonttbl{\f0\fnil\fcharset0 Calibri;}{\f1\fnil\fcharset0 Consolas;}{\f2\fnil Consolas;}}
{\colortbl ;\red0\green77\blue187;\red204\green120\blue50;\red169\green183\blue198;\red98\green151\blue85;\red155\green187\blue89;\red165\green194\blue97;\red0\green0\blue255;\red128\green128\blue128;}
{\*\generator Riched20 10.0.17134}\viewkind4\uc1
\pard\box\brdrdash\brdrw0 \sa200\sl276\slmult1\tx8640\f0\fs24 Functionalities V 1.0\fs22\par
\pard\box\brdrdash\brdrw0 \sa200\sl276\slmult1 SpectraFlares/RadioBurst (currently known as "Flare_Downloader" in the new radiospectra package). The aim of the package is to set up a project (environment) to download and read solar flares captured by the e-Callisto instruments, using a main list of detected flares (Monz 2010).\par
The script is built on top of the radiospectra package.\par
The functionality can be divided into two main parts: \par
-If the flares database has already been downloaded, the script can be used to browse the examples and to extract individual examples at will. \par
The "Model_Utils" methods can also be used to normalize and use the data for machine learning purposes.\par
-If the database is not available yet, the script itself can be used to download one.\par
In both cases a "base" list is needed with basic information such as instrument, date, frequencies, and time steps of the flares (the format of such a list is specified <here>):\par
\cf1 date obse ving state start end class sub qual. lower upper remarks\par
\cf0 100102 0801 1458 BLEN 0856.3 0858.2 III G 1 180 416 (relative path to \tab\tab\tab\tab\tab\tab\tab\tab\tab the file if already downloaded)\par
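As a rough illustration of the base list layout shown above, the following sketch loads one row into a pandas dataframe. The single-space separator, the sample text, and the variable names are assumptions made for this example; the real list may use a different delimiter.\par

```python
import io
import pandas as pd

# Column names taken from the header line of the base list shown above.
COLUMNS = ["date", "obse", "ving", "state", "start", "end",
           "class", "sub", "qual.", "lower", "upper"]

# Hypothetical sample: one space-separated row, without the remarks column.
sample = "100102 0801 1458 BLEN 0856.3 0858.2 III G 1 180 416"

# dtype=str keeps leading zeros such as "0801" intact.
base = pd.read_csv(io.StringIO(sample), sep=" ", names=COLUMNS, dtype=str)

# The remarks column is filled in later with the paths of the downloaded files.
base["remarks"] = ""

print(base.loc[0, "start"], base.loc[0, "class"])
```

Reading every column as text is a deliberate choice here, since the later preprocessing steps work on strings.\par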
\par
\f1\fs28 Code overview\par
\cf2\f2\fs20\lang9\par
\cf0\f1\lang1033 ## Main Method to download Flares, using a dataframe with a format like 2010Test.txt (It can be found in the examples section of the package) \par
\cf1\f2\lang9 def e_Callisto_burst_downloader(data, sort=False, folder="e-Callisto_Flares", exist=False):\cf3\par
\cf4\i     """\line     Download a set of bursts based on a dataframe with the Callisto format, adjusting the format of the dates and frequencies. The data is downloaded into files of 15 minutes each, i.e. a flare of 1 hour yields approximately 4 files. To get only one file per flare, follow this call with \cf5\i0 e_Callisto_Burst_simplifier\cf4\i .\line\line     Args:\line         data: pandas dataframe\line         sort: Python boolean. If 'True', it creates a subset of folders for the flare subtypes\line         folder: name or path of the folder where the flares will be downloaded\line         exist: Python boolean. If 'True', no error is raised when the folder already exists\line     Returns:\line         rclean: Pandas dataframe. Contains the information of all the already downloaded flares,\line             as well as the paths of their respective FITS files.\line         exceptions_frame: Pandas dataframe. Contains information about files that could not be downloaded\line     """\cf0\i0\line     start = time.time()\line     data = preprocessing_txt(data)\line     data = microseconds_clean(data)\line     os.makedirs('./\{\}'.format(folder), exist_ok=exist)\line     clean_directions = pd.DataFrame(columns=data.columns)\line     exceptions_frame = pd.DataFrame(columns=data.columns)\line     for index, row in data.iterrows():\line         clean_directions, exceptions_frame = e_Callisto_exceptionSeeker(index, data, clean_directions, exceptions_frame,\line                                                                         folder, sort)\line     rclean_test, rexcept_test = iter_remarks_Cleaners(clean_directions)\line     # DataFrame.append was removed in pandas 2.0; pd.concat is the replacement\line     exceptions_frame = pd.concat([exceptions_frame, rexcept_test])\line     end = time.time()\line     print("Download completed in ----- " + str(end - start) + " secs")\line     return rclean_test, exceptions_frame\par
\cf1 def preprocessing_txt(data):\line\cf3 \cf4\i """\line Preprocessing dataframe with the Callisto Flare Notation\f1\lang1033 , this is done by modifying the time and frequency ranges stored in the dataframe into standard time format\f2\lang9\line\line Args:\line data: Pandas dataframe\line Returns:\line Preprocessed dataframe\line """\line\cf0 \i0 data['date'] = data['date'].apply(lambda x: '\{0:0>8\}'.format(x))\line data['end'] = data['end'].apply(lambda x: '\{0:0>6\}'.format(x))\line data['start'] = data['start'].apply(lambda x: '\{0:0>6\}'.format(x))\line data['lower'] = data['lower'].astype(str).map(lambda x: x.rstrip('xX'))\line data['upper'] = data['upper'].astype(str).map(lambda x: x.rstrip('xX'))\line # Sanity preserver\line # data['lower'] = data['lower'].astype(str).str.replace('\\D-', '')\line # data['upper'] = data['upper'].astype(str).str.replace('\\D+', '')\line data['remarks'] = data['remarks'].astype(str)\line return data\par
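The padding and stripping steps of preprocessing_txt can be tried on a small frame. This is only a sketch: it uses the pandas string methods rjust and rstrip instead of str.format, and the sample values are made up for the example.\par

```python
import pandas as pd

# Hypothetical raw values, as they might come out of the base list.
raw = pd.DataFrame(dict(date=[100102], start=[856.3], upper=["416X"]))

padded = pd.DataFrame(dict(
    # Left-pad with zeros to 8 and 6 characters, as preprocessing_txt does.
    date=raw["date"].astype(str).str.rjust(8, "0"),
    start=raw["start"].astype(str).str.rjust(6, "0"),
    # Drop a trailing x/X marker from the frequency limit.
    upper=raw["upper"].astype(str).str.rstrip("xX"),
))

print(padded.iloc[0].tolist())
```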
\cf1 def microseconds_clean(data):\cf3\line\cf4\i     """\line     Converts the fractional minutes of the stored time values (e.g. the '.3' in '0856.3') into seconds, so that flares lasting only seconds to about a minute are not lost.\line\line     Args:\line         data: Pandas dataframe\line     Returns:\line         Preprocessed dataframe\line     """\cf0\i0\line     for index, elemen in data.iterrows():\line         if data.loc[index]['start'] == '000nan':\line             continue\line\line         string = data.loc[index]['start']\line         # zfill keeps two digits for the seconds (e.g. '06')\line         new_start = string[0:4] + str(int(float(string[4:6]) * 60)).zfill(2)\line         data.at[index, 'start'] = new_start\line\line         string = data.loc[index]['end']\line         new_end = string[0:4] + str(int(float(string[4:6]) * 60)).zfill(2)\line         data.at[index, 'end'] = new_end\line\line     return data\par
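As a worked example of the conversion above, the start and end times of the sample row can be converted by hand. The helper name is hypothetical and the zero padding of the seconds is an assumption of this sketch.\par

```python
def fractional_minutes_to_seconds(stamp):
    """Turn 'HHMM.f' (f = tenths of a minute) into 'HHMMSS'."""
    hhmm, fraction = stamp[0:4], stamp[4:6]
    seconds = int(round(float(fraction) * 60))
    # Keep two digits for the seconds so the result always has six characters.
    return hhmm + str(seconds).zfill(2)

print(fractional_minutes_to_seconds("0856.3"))
print(fractional_minutes_to_seconds("0858.2"))
```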
\f1\lang1033\par
\cf1\f2 def e_Callisto_exceptionSeeker(row_num, dataframe, new_frame, exceptions_fr, folder, sort=False):\cf3\line\cf4\i     """\line     Tries to download the flare stored in a row of the dataframe. If the download succeeds, a local folder is created to store the files and the local paths are saved into "new_frame"; if an error occurs while downloading, the information of the flare is stored in "exceptions_fr".\line\line     Args:\line         row_num: index of the row in the dataframe\line         dataframe: pandas dataframe\line         new_frame: pandas dataframe collecting the downloaded flares\line         exceptions_fr: pandas dataframe collecting the flares that could not be downloaded\line         folder: name or path of the folder where the flares will be downloaded\line         sort: Python boolean. If 'True', it creates a subset of folders for the flare subtypes\line     Returns:\line         new_frame: Pandas dataframe. Contains the information of all the already downloaded flares,\line             as well as the paths of their respective FITS files.\line         exceptions_fr: Pandas dataframe. Contains information about files that could not be downloaded\line\line     """\cf0\i0\line     try:\line\line         instrument, year, start, end = range_Generator(row_num, dataframe)\line         # No known instrument covers this frequency range\line         if instrument is None:\line             raise Exception\line\line         start = parse_time(year + ' ' + start)\line         end = parse_time(year + ' ' + end)\line         urls = query(start, end, [instrument])\line\line         row = dataframe.loc[row_num]\line         flareType = row['class']\line         subtype = row['sub']\line\line         if sort:\line             directory = directorySubtypeGenerator(folder, flareType, subtype)\line         else:\line             directory = directoryFlaretype(folder, flareType)\line\line         dirlist = ''\line         for url in urls:\line             dire = download_file(url, directory)\line             dirlist = dirlist + relpath(dire) + ','\line\line         # DataFrame.append was removed in pandas 2.0; pd.concat is the replacement\line         new_frame = pd.concat([new_frame, dataframe.loc[[row_num]]])\line         new_frame.at[row_num, 'remarks'] = dirlist\line         return new_frame, exceptions_fr\line     except Exception:\line         exceptions_fr = pd.concat([exceptions_fr, dataframe.loc[[row_num]]])\line         return new_frame, exceptions_fr\par
\par
\cf1\f1\lang9 def range_Generator(row_num, dataframe):\cf3\line \cf4\i """\line Generates the required strings to work with CallistoSpectrogram class\lang1033 , basically putting together all the "creator_" methods\f2\lang9\line\line Args:\line row_num: index of the row\line dataframe: pandas dataframe\line Returns:\line Modified time to use with standard Time Libraries\line """\line \cf0\i0 row = dataframe.loc[row_num]\line instrument = creator_instrument(row['lower'], row['upper'])\line year = creator_date(row['date'])\line start = creator_time(row['start'])\line end = creator_time(row['end'])\line return instrument, year, start, end\par
\par
\f0\fs24\lang1033 The "creator_" methods adjust the format of the time and frequency strings inside a dataframe so that they can be used with the CallistoSpectrogram methods.\fs20\par
\f2 def creator_instrument(lower, upper):\cf3\line\cf4\i     """\line     Generates the approximate instrument string based on the frequencies, to use directly with CallistoSpectrogram\line\line     Args:\line         lower: Lower frequency of the flare\line         upper: Upper frequency of the flare\line\line     Returns:\line         Name of the instrument based on the frequencies analysed, or None if no range matches\line     """\cf0\i0\line\line     lower = int(lower)\line     upper = int(upper)\line     if lower>=1200 and upper<=1800 : return "BLEN5M"\line     if lower>=110 and upper<=870 : return "BLEN7M"\line     #The upper value is set to 110 in order to download a wider range of flares\line     if lower>=20 and upper<=110 : return "BLENSW"\line     return None\line\cf6\line\cf0 def creator_date(date):\cf3\line\cf4\i     """\line     Creates the date format to use directly with CallistoSpectrogram\line\line     Args:\line         date: date from dataframe\line     Returns:\line         Modified date to use with standard Time Libraries\line     """\cf0\i0\line     date = date_cleaner(date)\line     date = re.sub(r'((?:(?=(1|.))\\2)\{2\})(?!$)', r'\\1/', date)\line     if int(date[0] + date[1]) > 50:\line         date = '19' + date\line     else:\line         date = '20' + date\line     return date\cf3\line\line\cf0 def creator_time(time):\cf3\line\cf4\i     """\line     Creates the time format to use directly with CallistoSpectrogram\line\line     Args:\line         time: time from dataframe\line     Returns:\line         Modified time to use with standard Time Libraries\line     """\cf0\i0\line     time = str(time)\line     return re.sub(r'((?:(?=(1|.))\\2)\{2\})(?!$)', r'\\1:', time)\par
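The regular expression used by creator_date and creator_time appears to insert a separator after every pair of characters. The sketch below reproduces that effect with plain slicing so the result is easier to follow. The function names are hypothetical, and the input is assumed to be an already cleaned string of even length.\par

```python
def split_pairs(text, separator):
    """Insert separator after every two characters, except at the end."""
    pairs = [text[i:i + 2] for i in range(0, len(text), 2)]
    return separator.join(pairs)

def expand_date(yymmdd):
    """Prefix the century using the same pivot (50) as creator_date."""
    century = "19" if int(yymmdd[0:2]) > 50 else "20"
    return century + split_pairs(yymmdd, "/")

print(expand_date("100102"))
print(split_pairs("085618", ":"))
```

With the sample row, the date 100102 and the start time 085618 become strings that standard time parsers can read.\par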
\par
\cf1 def directorySubtypeGenerator(folder, flareType, subtype):\cf3\line \cf4\i """\line Generates Directories based in the subtype and type of flares\line\line Args:\line folder: root directory\line flareType: type of flare from dataframe\line subtype: subtype of flare from dataframe\line Returns:\line path to new flare directory\line """\line\cf0 \i0 if os.path.isdir('./\{\}/\{\}/\{\}'.format(folder, flareType, subtype)) == False:\line os.makedirs('./\{\}/\{\}/\{\}'.format(folder, flareType, subtype))\line return os.path.realpath('./\{\}/\{\}/\{\}'.format(folder, flareType, subtype))\line else:\line return os.path.realpath('./\{\}/\{\}/\{\}'.format(folder, flareType, subtype))\par
\cf1 def directoryFlaretype(folder, flareType):\cf3\line \cf4\i """\line Generates Directories based ONLY in the type of flares\line\line Args:\line folder: root directory\line flareType: type of flare from dataframe\line Returns:\line path to new flares directory\line """\line\cf0 \i0 if os.path.isdir('./\{\}/\{\}'.format(folder, flareType)) == False:\line os.makedirs('./\{\}/\{\}'.format(folder, flareType))\line return os.path.realpath('./\{\}/\{\}'.format(folder, flareType))\line else:\line return os.path.realpath('./\{\}/\{\}'.format(folder, flareType))\par
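Both directory helpers follow the same pattern: create the folder if it is missing and return its real path. A compact sketch of that pattern is given below, using os.makedirs with exist_ok=True and os.path.join. The function name and the temporary root folder are assumptions made for the example.\par

```python
import os
import tempfile

def flare_directory(root, flare_type, subtype=None):
    """Create root/flare_type[/subtype] if needed and return its real path."""
    parts = [root, str(flare_type)]
    if subtype is not None:
        parts.append(str(subtype))
    path = os.path.join(*parts)
    # exist_ok=True makes a second call for the same folder harmless.
    os.makedirs(path, exist_ok=True)
    return os.path.realpath(path)

with tempfile.TemporaryDirectory() as root:
    first = flare_directory(root, "III", "G")
    second = flare_directory(root, "III", "G")
    created = os.path.isdir(first)

print(first == second, created)
```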
\par
\cf1 def remarks_Cleaners(row_num, dataframe, new_frame, exceptions_fr):\cf3\line\cf4\i     """\line     Separates the rows with a blank remarks column from an already downloaded dataframe, mainly to avoid issues while trying to work with the dataframe as a whole (e.g. using the simplifier method)\line\line     Args:\line         row_num: index in the dataframe\line         dataframe: pandas dataframe\line         new_frame: pandas dataframe\line         exceptions_fr: pandas dataframe\line\line     Returns:\line         new_frame: Pandas dataframe. Contains the information of all the already downloaded flares,\line             as well as the paths of their respective FITS files.\line         exceptions_fr: Pandas dataframe. Contains information about files that could not be downloaded\line\line     """\cf0\i0\line     row = dataframe.loc[row_num]\line     directions = row['remarks']\line\line     # DataFrame.append was removed in pandas 2.0; pd.concat is the replacement\line     if directions != '':\line         new_frame = pd.concat([new_frame, dataframe.loc[[row_num]]])\line     else:\line         exceptions_fr = pd.concat([exceptions_fr, dataframe.loc[[row_num]]])\line     return new_frame, exceptions_fr\par
\cf1 def iter_remarks_Cleaners(data):\line\cf3 \cf4\i """\par
Iterates over a dataframe using remarks_Cleaners\par
Args:\line data: pandas dataframe\line Returns:\line \f1 clean_directions\f2 : Pandas dataframe. Contains the information of all the already downloaded flares\f1 .\f2\line exceptions_frame: Pandas dataframe. Contains information about files that could not be downloaded\line\par
"""\line\cf0 \i0 clean_directions = pd.DataFrame(columns = data.columns)\line exceptions_frame = pd.DataFrame(columns = data.columns)\line for index, row in data.iterrows():\line clean_directions, exceptions_frame = remarks_Cleaners(index, data, clean_directions, exceptions_frame)\line return clean_directions, exceptions_frame\par
\f0\par
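The row-by-row split performed by remarks_Cleaners and iter_remarks_Cleaners can also be written with a boolean mask. This is an alternative sketch and not the method used by the package; the function name and the sample paths are hypothetical.\par

```python
import pandas as pd

def split_by_remarks(data):
    """Return (rows with a path in remarks, rows with empty remarks)."""
    has_path = data["remarks"].astype(str) != ""
    return data[has_path], data[~has_path]

frame = pd.DataFrame(dict(
    date=["100102", "100103"],
    remarks=["III/G/BLEN7M_a.fit.gz,", ""],   # hypothetical paths
))

clean, failed = split_by_remarks(frame)
print(len(clean), len(failed))
```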
\fs24 ## The next method is used to join several files of flares into one (one file per flare)\par
\cf1\f2\fs20 def e_Callisto_Burst_simplifier(dataframe, folder, sort=False):\cf3\line \cf4\i """\line Joins data into time axis per Flare\f1 by iterating over an already downloaded dataframe.\par
\f2\line     Args:\line         dataframe: pandas dataframe.\line         folder: name or path of the folder where the joined files will be saved.\line         sort: Python boolean. If 'True', it creates a subset of folders for the flare subtypes.\line     Returns:\line         joined: Pandas dataframe. Df containing the paths of the joined FITS files.\line         special: Pandas dataframe. Df containing info about damaged flares.\line     """\cf0\i0\line     start = time.time()\line     os.makedirs('./\{\}'.format(folder))\line     joined = pd.DataFrame(columns=dataframe.columns)\line     special = pd.DataFrame(columns=dataframe.columns)\line\line     for index, elem in dataframe.iterrows():\line\line         directions = dir_Gen(index, dataframe)\line         name = os.path.basename(directions[0])\line         try:\line             bursts_here = CallistoSpectrogram.from_files(directions)\line         except ValueError:\line             print("Damaged file found at \\n" + str(dataframe.loc[index]))\line             # DataFrame.append was removed in pandas 2.0; pd.concat is the replacement\line             special = pd.concat([special, dataframe.loc[[index]]])\line             continue\line         row = dataframe.loc[index]\line\line         flareType = row['class']\line         if sort:\line             directory = directorySubtypeGenerator(folder, flareType, row['sub'])\line         else:\line             directory = directoryFlaretype(folder, flareType)\line\line         # os.path.join picks the right separator on every platform\line         path = os.path.join(directory, name)\line         CallistoSpectrogram.join_many(bursts_here).save(relpath(path))\line         joined = pd.concat([joined, dataframe.loc[[index]]])\line         joined.at[index, 'remarks'] = relpath(path)\line     end = time.time()\line     print("\\nJoined after----- " + str(end - start) + " secs\\n")\line     return joined, special\par
\cf1 def dir_Gen(row_num, dataframe):\line\cf3 \cf4\i """\line Gets the directory of the data from the remarks column\par
Args:\line \f1 row_num\f2 : \f1 Index in dataframe\f2\line \f1 dataframe\f2 :\f1 \f2 pandas dataframe.\line Returns:\line \f1 directionsList: List containing path to files\f2\line\line """\line\cf0\line \i0 row = dataframe.loc[row_num]\line directions = row['remarks']\line\line directionsList = [x.strip() for x in directions.split(',')[:-1]]\line\line return directionsList\par
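The remarks column stores the downloaded paths as one comma-terminated string, which dir_Gen splits back into a list. The round trip can be sketched as follows; the file names are hypothetical.\par

```python
# Hypothetical paths of two 15-minute files belonging to the same flare.
paths = ["III/G/BLEN7M_a.fit.gz", "III/G/BLEN7M_b.fit.gz"]

# The downloader appends every path followed by a comma.
remarks = ""
for path in paths:
    remarks = remarks + path + ","

# dir_Gen drops the empty piece after the last comma.
recovered = [x.strip() for x in remarks.split(",")[:-1]]
empty = [x.strip() for x in "".split(",")[:-1]]

print(recovered == paths, empty)
```

An empty remarks string yields an empty list, which is why rows without paths are filtered out before the simplifier is used.\par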
\f0\fs24\par
## The following methods are used to have a preview of the dataset\f2\fs20\par
\cf8\f1\lang9 # Peek a flare from the Callisto online database\f2\line\cf1 def Callisto_flare(row_num, dataframe, show_url=False):\line\cf4\i     """\line     Peek a flare from a row of a given dataframe, fetched from the online database. This can be used to test and compare files; beware that since CallistoSpectrogram.from_range is used here, the results may look a bit different.\line\line     Args:\line         row_num: position of the element in the dataframe\line         dataframe: pandas dataframe\line         show_url: Boolean. If true, the URLs of the files will be shown\line\line     Returns:\line         CallistoSpectrogram object\line\line     """\cf0\i0\line\line     row = dataframe.loc[row_num]\line     instrument, year, start, end = range_Generator(row_num, dataframe)\line     print(instrument)\line     print(' ' + row['lower'], row['upper'])\line     print(creator_date(row['date']))\line     print(start)\line     print(end)\line     if show_url:\line         startQ = parse_time(year + ' ' + start)\line         endQ = parse_time(year + ' ' + end)\line         urls = query(startQ, endQ, [instrument])\line         for url in urls:\line             print(url)\line\line     Spectra = CallistoSpectrogram.from_range(instrument, year + ' ' + start, year + ' ' + end)\line     Spectra.peek()\line     return Spectra\line\cf3\line\cf1 def Callisto_simple_flare(index, dataframe):\line\cf4\i     """\line     Peeks a spectrogram from an already downloaded dataset, using the index with "loc"\line\line     Args:\line         index: index label of the element in the dataframe\line         dataframe: pandas dataframe\line     Returns:\line         CallistoSpectrogram object\line     """\cf0\i0\line     Spectra = CallistoSpectrogram.read(dataframe.loc[index]['remarks'])\line     Spectra.peek()\line     return Spectra\line\cf3\line\cf1 def Callisto_simple_iflare(index, dataframe):\cf3\line\cf4\i     """\line     Peeks a spectrogram from an already downloaded dataset, using the index with "iloc"\line\line     Args:\line         index: integer position of the element in the dataframe\line         dataframe: pandas dataframe\line     Returns:\line         CallistoSpectrogram object\line     """\cf0\i0\line     Spectra = CallistoSpectrogram.read(dataframe.iloc[index]['remarks'])\line     Spectra.peek()\line     return Spectra\line\cf3\line\cf1 def preview(dataframe, show_details = True):\cf3\line\cf4\i     """\line     Shows a preview of the whole dataframe. To limit the number of elements being shown, slice the dataframe first (like df[0:10])\line     Args:\line         dataframe: pandas dataframe\line         show_details: Boolean, if true shows information about the flare\line     Returns:\line         None. Each spectrogram is displayed with Callisto_simple_flare\line     """\cf0\i0\line     for index, elem in dataframe.iterrows():\line         row = dataframe.loc[index]\line         instrument, year, start, end = range_Generator(index, dataframe)\line         if show_details:\line             print("Type "+str(row['class']))\line             print(' Range ' + row['lower'], row['upper'])\line             print(start)\line             print(end)\line             print(creator_date(row['date']))\line         Callisto_simple_flare(index, dataframe)\line\cf3\par
\cf0\f1\par
\pard\sa200\sl276\slmult1\f0\fs22\par
\par
}