diff --git a/.nojekyll b/.nojekyll
new file mode 100644
index 0000000..e69de29
diff --git a/404.html b/404.html
new file mode 100644
index 0000000..116346a
--- /dev/null
+++ b/404.html
@@ -0,0 +1,1320 @@
+Welcome to the R and Python bilingualism reference guide! If you’re fluent in one of these languages but hesitant to learn the other, you’re in the right place. The good news is that there are many similarities between R and Python that make it easy to switch between the two.
+Both R and Python are widely used in data science and are open-source, meaning that they are free to use and constantly being improved by the community. They both have extensive libraries for data analysis, visualization, and machine learning. In fact, many libraries in the two languages offer similar functionality, such as Pandas in Python and data.table or dplyr in R for working with tabular data.
+While there are differences between the two languages, they can +complement each other well. Python is versatile and scalable, making it +ideal for large and complex projects such as web development and +artificial intelligence. R, on the other hand, is known for its +exceptional statistical capabilities and is often used in data analysis +and modeling. Visualization is also easier in R, making it a popular +choice for creating graphs and charts.
+By learning both R and Python, you’ll be able to take advantage of the +strengths of each language and create more efficient and robust data +analysis workflows. Don’t let the differences between the two languages +intimidate you - once you become familiar with one, learning the other +will be much easier.
+So, whether you’re a Python enthusiast looking to expand your +statistical analysis capabilities, or an R user interested in exploring +the world of web development and artificial intelligence, this guide +will help you become bilingual in R and Python.
+In R, packages can be installed from CRAN repository by using the +install.packages() function:
+R code:
+# Install the dplyr package from CRAN
+install.packages("dplyr")
+
In Python, packages can be installed from the Anaconda repository by +using the conda install command:
+Python code:
+# Install the pandas package from Anaconda
+!conda install pandas
+
Loading libraries in R and Python
+In R, libraries are loaded with the library() function:
+R code:
+# Load the dplyr library
+library(dplyr)
+
In Python, libraries are loaded with the import statement. Here’s an example:
+Python code:
+# Load the pandas library
+import pandas as pd
+
Note that the package or library must be installed from the respective +repository before it can be loaded. Also, make sure you have the correct +repository specified in your system before installing packages. By +default, R uses CRAN as its primary repository, whereas Anaconda uses +its own repository by default.
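+If you want to confirm which repositories will be used before installing anything, R reports its CRAN mirror via getOption("repos"), and conda can list its configured channels. A quick check of the latter (shown notebook-style, like the conda install call above) might look like this:
+
+# List the conda channels currently configured
+!conda config --show channels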
+The reticulate package lets you run both R and Python together in the R +environment.
+R libraries are stored and managed in a repository called CRAN. You can download R packages with the install.packages() function:
+install.packages("reticulate")
+
+You only need to install packages once, but you need to load those packages with the library() function each time you start a new R session.
+library(reticulate)
+
+Python libraries are stored and managed in several different repositories (such as PyPI and conda channels), and their dependencies are not regulated as strictly as R packages are on CRAN. It’s easier to publish a Python package, but it can also be more cumbersome for users because you need to manage dependencies yourself. You can install Python packages using both R and Python code:
+py_install("laspy")
+
## + '/Users/ty/opt/miniconda3/bin/conda' 'install' '--yes' '--prefix' '/Users/ty/opt/miniconda3/envs/earth-analytics-python' '-c' 'conda-forge' 'laspy'
+
Now, let’s create a Python list and assign it to a variable py_list:
+R code:
+py_list <- r_to_py(list(1, 2, 3))
+
We can now print out the py_list variable in Python using the +py_run_string() function:
+R code:
+py_run_string("print(r.py_list)")
+
This will output [1, 2, 3] in the Python console.
+Now, let’s create an R vector and assign it to a variable r_vec:
+R code:
+r_vec <- c(4, 5, 6)
+
We can also access Python variables from R using the py$ syntax. For example, we can print the py_list variable in R:
+R code:
+print(py$py_list)
+
This will output [1, 2, 3] in the R console.
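+Going the other direction, R objects are exposed to Python through reticulate’s r object. A small sketch, assuming you are in a reticulate Python session (for example, a Python chunk in R Markdown):
+
+Python code:
+
+# Access the R vector r_vec from Python; reticulate converts it to a list of floats
+print(r.r_vec)  # expected output: [4.0, 5.0, 6.0]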
We can also call Python functions from R using the py_call() function. For example, let’s call Python’s built-in sum() function (obtained via import_builtins()) on the py_list variable and assign the result to an R variable r_sum:
+R code:
+builtins <- import_builtins()
+r_sum <- py_call(builtins$sum, py_list)
+
We can now print out the r_sum variable in R:
+R code:
+print(r_sum)
+
This will output 6 in the R console.
+options(java.parameters = "-Xmx5G")
+
+library(r5r)
+library(sf)
+library(data.table)
+library(ggplot2)
+library(interp)
+library(akima)
+library(dplyr)
+library(raster)
+library(osmdata)
+library(ggthemes)
+library(mapview)
+library(cowplot)
+library(here)
+library(testthat)
+
import sys
+sys.argv.append(["--max-memory", "5G"])
+
+import pandas as pd
+import geopandas
+import matplotlib.pyplot as plt
+import numpy as np
+import plotnine
+import contextily as cx
+import r5py
+import seaborn as sns
+
R and Python are two popular programming languages used for data +analysis, statistics, and machine learning. Although they share some +similarities, there are some fundamental differences between them. +Here’s an example code snippet in R and Python to illustrate some of the +differences:
+R Code:
+# Create a vector of numbers from 1 to 10
+x <- 1:10
+
+# Compute the mean of the vector
+mean_x <- mean(x)
+
+# Print the result
+print(mean_x)
+
## [1] 5.5
+
Python Code:
+# Import the numpy library for numerical operations
+import numpy as np
+
+# Create a numpy array of numbers from 1 to 10
+x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
+
+# Compute the mean of the array
+mean_x = np.mean(x)
+
+# Print the result
+print(mean_x)
+
## 5.5
+
In this example, we can see that there are several differences between R +and Python:
Syntax: R uses the assignment operator <- while Python uses the equals sign = for variable assignment.
+Libraries: Python relies heavily on external libraries such as numpy, +pandas, and matplotlib for data analysis, while R has built-in functions +for many data analysis tasks.
+Data types: R is designed to work with vectors and matrices, while +Python uses lists and arrays. In the example above, we used the numpy +library to create a numerical array in Python.
+Function names: Function names in R and Python can differ significantly. +In the example above, we used the mean() function in R and the np.mean() +function in Python to calculate the mean of the vector/array.
+These are just a few of the many differences between R and Python. +Ultimately, the choice between the two languages will depend on your +specific needs and preferences.
+R Code:
+data("iris")
+here()
+load(file=here("2_R_and_Py_bilingualism", "data", "iris_example_data.rdata"))
+objects()
+
Python code:
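+One possible Python counterpart is sketched below; it uses the copy of the iris data bundled with scikit-learn, and the commented-out CSV path simply mirrors the R example, so treat both as assumptions:
+
+# Load the iris data in Python via scikit-learn (analogous to data("iris") in R)
+import pandas as pd
+from sklearn.datasets import load_iris
+
+iris_py = load_iris(as_frame=True).frame
+
+# If the CSV written in the R example below already exists, it could be read instead:
+# iris_py = pd.read_csv("2_R_and_Py_bilingualism/data/iris_example_data.csv")
+
+print(iris_py.head())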
+R Code:
+save(iris, file=here("2_R_and_Py_bilingualism", "data", "iris_example_data.rdata"))
+
+write.csv(iris, file=here("2_R_and_Py_bilingualism", "data", "iris_example_data.csv"))
+
Python code:
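+A minimal sketch of the equivalent save step in Python, assuming the iris_py DataFrame from the sketch above and mirroring the R file paths:
+
+# Save the DataFrame in a binary format (pickle) and as CSV
+iris_py.to_pickle("2_R_and_Py_bilingualism/data/iris_example_data.pkl")
+iris_py.to_csv("2_R_and_Py_bilingualism/data/iris_example_data.csv", index=False)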
Both R and Python are powerful languages for writing functions that can take input, perform a specific task, and return output.
+
+R Code:
+# Define a function that takes two arguments and returns their sum
+sum_r <- function(a, b) {
+ return(a + b)
+}
+
+# Call the function with two arguments and print the result
+result_r <- sum_r(3, 5)
+print(result_r)
+
## [1] 8
+
Python code:
+# Define a function that takes two arguments and returns their sum
+def sum_py(a, b):
+ return a + b
+
+# Call the function with two arguments and print the result
+result_py = sum_py(3, 5)
+print(result_py)
+
## 8
+
In both cases, we define a function that takes two arguments and returns +their sum. In R, we use the function keyword to define a function, while +in Python, we use the def keyword. The function body in R is enclosed in +curly braces, while in Python it is indented.
+There are a few differences in the syntax and functionality between the +two approaches:
Function arguments: In both languages, arguments are listed in parentheses and separated by commas; however, the syntax for specifying default arguments and variable-length argument lists differs between R and Python (see the short Python sketch below).
+
+Return statement: In R, return() is a function call and is optional, since the value of the last evaluated expression is returned automatically, while in Python the return statement is used.
+
+Function names: Function names in R and Python can differ significantly. In the example above, we used the sum_r() function in R and the sum_py() function in Python to calculate the sum of two numbers.
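+To make the argument-handling point concrete, here is a small illustrative Python sketch (the function name is hypothetical) showing a default argument and a variable-length argument list:
+
+# *values collects any number of positional arguments; exponent defaults to 1
+def power_sum(*values, exponent=1):
+    return sum(v ** exponent for v in values)
+
+print(power_sum(1, 2, 3))              # 6
+print(power_sum(1, 2, 3, exponent=2))  # 14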
+R Code:
+# Load the "ggplot2" package for plotting
+library(ggplot2)
+
+# Generate some sample data
+x <- seq(1, 10, 1)
+y <- x + rnorm(10)
+
+# Create a scatter plot
+ggplot(data.frame(x, y), aes(x = x, y = y)) +
+ geom_point()
+
+Python code:
# Load the "matplotlib" library
+import matplotlib.pyplot as plt
+
+# Generate some sample data
+import numpy as np
+x = np.arange(1, 11)
+y = x + np.random.normal(0, 1, 10)
+
+#clear last plot
+plt.clf()
+
+# Create a scatter plot
+plt.scatter(x, y)
+plt.show()
+
In both cases, we generate some sample data and create a scatter plot to +visualize the relationship between the variables.
+There are a few differences in the syntax and functionality between the +two approaches:
Library and package names: In R, we use the ggplot2 package for plotting, while in Python, we use the matplotlib library.
+
+Data format: In R, we use a data frame to store the input data, while in Python, we use numpy arrays.
+
+Plotting functions: In R, we use the ggplot() function to create a new plot object, and then use the geom_point() function to create a scatter plot layer. In Python, we use the scatter() function from the matplotlib.pyplot module to create a scatter plot directly.
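+As an aside, the plotnine library imported in the setup chunk earlier brings a ggplot2-style grammar to Python; a brief optional sketch, not part of the original example:
+
+import numpy as np
+import pandas as pd
+from plotnine import ggplot, aes, geom_point
+
+# Same kind of sample data as above, plotted with ggplot2-like syntax
+df = pd.DataFrame({'x': np.arange(1, 11)})
+df['y'] = df['x'] + np.random.normal(0, 1, 10)
+
+print(ggplot(df, aes(x='x', y='y')) + geom_point())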
+R Code:
+# Load the "ggplot2" package for plotting
+library(ggplot2)
+
+# Generate some sample data
+x <- seq(1, 10, 1)
+y <- x + rnorm(10)
+
+# Perform linear regression
+model_r <- lm(y ~ x)
+
+# Print the model summary
+summary(model_r)
+
##
+## Call:
+## lm(formula = y ~ x)
+##
+## Residuals:
+## Min 1Q Median 3Q Max
+## -1.69344 -0.42336 0.08961 0.34778 1.56728
+##
+## Coefficients:
+## Estimate Std. Error t value Pr(>|t|)
+## (Intercept) -0.1676 0.6781 -0.247 0.811
+## x 0.9750 0.1093 8.921 1.98e-05 ***
+## ---
+## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
+##
+## Residual standard error: 0.9926 on 8 degrees of freedom
+## Multiple R-squared: 0.9087, Adjusted R-squared: 0.8972
+## F-statistic: 79.59 on 1 and 8 DF, p-value: 1.976e-05
+
# Plot the data and regression line
+ggplot(data.frame(x, y), aes(x = x, y = y)) +
+ geom_point() +
+ geom_smooth(method = "lm", se = FALSE)
+
## `geom_smooth()` using formula = 'y ~ x'
+
Python code:
+# Load the "matplotlib" and "scikit-learn" libraries
+import matplotlib.pyplot as plt
+from sklearn.linear_model import LinearRegression
+
+# Generate some sample data
+import numpy as np
+x = np.arange(1, 11)
+y = x + np.random.normal(0, 1, 10)
+
+# Perform linear regression
+model_py = LinearRegression().fit(x.reshape(-1, 1), y)
+
+# Print the model coefficients
+print("Coefficients: ", model_py.coef_)
+
## Coefficients: [1.15539692]
+
print("Intercept: ", model_py.intercept_)
+
+#clear last plot
+
## Intercept: -1.1291396173221218
+
plt.clf()
+
+# Plot the data and regression line
+plt.scatter(x, y)
+plt.plot(x, model_py.predict(x.reshape(-1, 1)), color='red')
+plt.show()
+
In both cases, we generate some sample data with a linear relationship +between x and y, and then perform a simple linear regression to estimate +the slope and intercept of the line. We then plot the data and +regression line to visualize the fit.
+There are a few differences in the syntax and functionality between the +two approaches:
Library and package names: In R, we use the lm() function from the base package to perform linear regression, while in Python, we use the LinearRegression() class from the scikit-learn library. Additionally, we use the ggplot2 package in R for plotting, while we use the matplotlib library in Python.
+
+Data format: In R, we can specify the dependent and independent variables in the formula used for regression. In Python, we need to reshape the input data to a two-dimensional array before fitting the model.
+
+Model summary: In R, we can use the summary() function to print a summary of the model, including the estimated coefficients, standard errors, and p-values. In Python, we need to print the coefficients and intercept separately.
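+If an R-style summary table is wanted in Python, the statsmodels library (not used in the example above, so this is an optional sketch) provides one:
+
+import numpy as np
+import statsmodels.api as sm
+
+# Same kind of sample data as above
+x = np.arange(1, 11)
+y = x + np.random.normal(0, 1, 10)
+
+# add_constant() appends an intercept column before fitting ordinary least squares
+X = sm.add_constant(x)
+model_sm = sm.OLS(y, X).fit()
+print(model_sm.summary())  # coefficients, standard errors, p-values, R-squared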
+R Code:
+# Load the "randomForest" package
+library(randomForest)
+
+# Load the "iris" dataset
+data(iris)
+
+# Split the data into training and testing sets
+set.seed(123)
+train_idx <- sample(1:nrow(iris), nrow(iris) * 0.7, replace = FALSE)
+train_data <- iris[train_idx, ]
+test_data <- iris[-train_idx, ]
+
+# Build a random forest model
+rf_model <- randomForest(Species ~ ., data = train_data, ntree = 500)
+
+# Make predictions on the testing set
+predictions <- predict(rf_model, test_data)
+
+# Calculate accuracy of the model
+accuracy <- sum(predictions == test_data$Species) / nrow(test_data)
+print(paste("Accuracy:", accuracy))
+
## [1] "Accuracy: 0.977777777777778"
+
Python code:
+# Load the "pandas", "numpy", and "sklearn" libraries
+import pandas as pd
+import numpy as np
+from sklearn.ensemble import RandomForestClassifier
+from sklearn.datasets import load_iris
+from sklearn.model_selection import train_test_split
+
+# Load the "iris" dataset
+iris = load_iris()
+
+# Split the data into training and testing sets
+X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.3, random_state=123)
+
+# Build a random forest model
+rf_model = RandomForestClassifier(n_estimators=500, random_state=123)
+rf_model.fit(X_train, y_train)
+
+# Make predictions on the testing set
+
## RandomForestClassifier(n_estimators=500, random_state=123)
+
predictions = rf_model.predict(X_test)
+
+# Calculate accuracy of the model
+accuracy = sum(predictions == y_test) / len(y_test)
+print("Accuracy:", accuracy)
+
## Accuracy: 0.9555555555555556
+
In both cases, we load the iris dataset and split it into training and +testing sets. We then build a random forest model using the training +data and evaluate its accuracy on the testing data.
+There are a few differences in the syntax and functionality between the +two approaches:
Library and package names: In R, we use the randomForest package to build random forest models, while in Python, we use the RandomForestClassifier class from the sklearn.ensemble module. We also use different libraries for loading and manipulating data (pandas and numpy in Python, and built-in datasets in R).
+
+Model parameters: The syntax for setting model parameters is slightly different in R and Python. For example, in R, we specify the number of trees using the ntree parameter, while in Python, we use the n_estimators parameter.
+
+Data format: In R, we use a data frame to store the input data, while in Python, we use numpy arrays.
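+As a small aside, scikit-learn also ships a helper that replaces the manual accuracy calculation in the Python example above; a one-line sketch reusing y_test and predictions:
+
+from sklearn.metrics import accuracy_score
+
+# Equivalent to sum(predictions == y_test) / len(y_test)
+print("Accuracy:", accuracy_score(y_test, predictions))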
+R Code:
+# Load the "osmdata" package for mapping
+library(osmdata)
+library(tmap)
+
+# Define the map location and zoom level
+bbox <- c(left = -0.16, bottom = 51.49, right = -0.13, top = 51.51)
+
+# Get the OpenStreetMap data
+osm_data <- opq(bbox) %>%
+ add_osm_feature(key = "highway") %>%
+ osmdata_sf()
+
+# Plot the map using tmap
+tm_shape(osm_data$osm_lines) +
+ tm_lines()
+
+Python code:
# Load the "osmnx" package for mapping
+import osmnx as ox
+
+# Define the map location and zoom level
+bbox = (51.49, -0.16, 51.51, -0.13)
+
+# Get the OpenStreetMap data
+osm_data = ox.graph_from_bbox(north=bbox[2], south=bbox[0], east=bbox[3], west=bbox[1], network_type='all')
+
+# Plot the map using osmnx
+ox.plot_graph(osm_data)
+
## (<Figure size 1600x1600 with 0 Axes>, <AxesSubplot:>)
+
In both cases, we define the map location and zoom level, retrieve the +OpenStreetMap data using the specified bounding box, and plot the map.
+The main differences between the two approaches are:
Package names and syntax: In R, we use the osmdata package and its syntax to download and process the OpenStreetMap data, while in Python, we use the osmnx package and its syntax.
+
+Mapping libraries: In R, we use the tmap package to create a static map of the OpenStreetMap data, while in Python, we use the built-in ox.plot_graph function from the osmnx package to plot the map.
+R Code:
+# Load the "keras" package for building the CNN
+library(tensorflow)
+library(keras)
+
+# Load the "raster" package for working with raster data
+library(raster)
+
+# Load the "magrittr" package for pipe operator
+library(magrittr)
+
+# Load the data as a raster brick
+raster_data <- brick("raster_data.tif")
+
+# Split the data into training and testing sets
+split_data <- sample(1:nlayers(raster_data), size = nlayers(raster_data)*0.8, replace = FALSE)
+train_data <- raster_data[[split_data]]
+test_data <- raster_data[[setdiff(1:nlayers(raster_data), split_data)]]
+
+# Define the CNN model
+model <- keras_model_sequential() %>%
+ layer_conv_2d(filters = 32, kernel_size = c(3, 3), activation = "relu", input_shape = c(ncol(train_data), nrow(train_data), ncell(train_data))) %>%
+ layer_max_pooling_2d(pool_size = c(2, 2)) %>%
+ layer_dropout(rate = 0.25) %>%
+ layer_flatten() %>%
+ layer_dense(units = 128, activation = "relu") %>%
+ layer_dropout(rate = 0.5) %>%
+ layer_dense(units = nlayers(train_data), activation = "softmax")
+
+# Compile the model
+model %>% compile(loss = "categorical_crossentropy", optimizer = "adam", metrics = "accuracy")
+
+# Train the model
+history <- model %>% fit(x = array(train_data), y = to_categorical(1:nlayers(train_data)), epochs = 10, validation_split = 0.2)
+
+# Evaluate the model
+model %>% evaluate(x = array(test_data), y = to_categorical(1:nlayers(test_data)))
+
+# Plot the model accuracy over time
+plot(history)
+
Piping is a powerful feature in both R and Python that allows for a more +streamlined and readable code. However, the syntax for piping is +slightly different between the two languages.
+In R, piping is done using the %>% operator from the magrittr package (or the native |> operator in recent versions of R). Python has no dedicated pipe operator for data frames; instead, pandas relies on method chaining and the DataFrame .pipe() method.
+Let’s compare and contrast piping in R and Python with some examples:
Piping in R
+
+In R, we can use the %>% operator to pipe output from one function to another, which can make our code more readable and easier to follow. Here’s an example:
+R code:
+library(dplyr)
+
+# create a data frame
+df <- data.frame(x = c(1,2,3), y = c(4,5,6))
+
+# calculate the sum of column x and y
+df %>%
+ mutate(z = x + y) %>%
+ summarize(sum_z = sum(z))
+
## sum_z
+## 1 21
+
In this example, we first create a data frame df with two columns x and +y. We then pipe the output of df to mutate, which adds a new column z to +the data frame that is the sum of x and y. Finally, we pipe the output +to summarize, which calculates the sum of z and returns the result.
Piping in Python
+
+In Python, there is no pipe operator for DataFrames; instead, we chain methods on the DataFrame (or pass it through .pipe()), so the output of one step feeds directly into the next. Here’s an example:
+Python code:
+import pandas as pd
+
+# create a DataFrame
+df = pd.DataFrame({'x': [1,2,3], 'y': [4,5,6]})
+
+# calculate the sum of column x and y
+(df.assign(z = df['x'] + df['y'])
+ .agg(sum_z = ('z', 'sum')))
+
## z
+## sum_z 21
+
In this example, we first create a DataFrame df with two columns x and +y. We then use the assign() method to add a new column z to the +DataFrame that is the sum of x and y. Finally, we use the agg() method +to calculate the sum of z and return the result.
+As we can see, the syntax for piping is slightly different between R and +Python, but the concept remains the same. Piping can make our code more +readable and easier to follow, which is an important aspect of creating +efficient and effective code.
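+pandas also offers an explicit .pipe() method, the closest analogue to %>% when you want to pass a DataFrame through your own functions. A short sketch (the helper function is hypothetical):
+
+import pandas as pd
+
+def add_sum_column(d):
+    # return a copy with a new column z = x + y
+    return d.assign(z=d['x'] + d['y'])
+
+df = pd.DataFrame({'x': [1, 2, 3], 'y': [4, 5, 6]})
+
+# .pipe() passes the DataFrame to the function, so the steps read top to bottom
+result = df.pipe(add_sum_column)['z'].sum()
+print(result)  # 21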
+R code:
+library(dplyr)
+library(ggplot2)
+
+iris %>%
+ filter(Species == "setosa") %>%
+ group_by(Sepal.Width) %>%
+ summarise(mean.Petal.Length = mean(Petal.Length)) %>%
+ mutate(Sepal.Width = as.factor(Sepal.Width)) %>%
+ ggplot(aes(x = Sepal.Width, y = mean.Petal.Length)) +
+ geom_bar(stat = "identity", fill = "dodgerblue") +
+ labs(title = "Mean Petal Length of Setosa by Sepal Width",
+ x = "Sepal Width",
+ y = "Mean Petal Length")
+
In this example, we start with the iris dataset and filter it to only +include rows where the Species column is “setosa”. We then group the +remaining rows by the Sepal.Width column and calculate the mean +Petal.Length for each group. Next, we convert Sepal.Width to a factor +variable to ensure that it is treated as a categorical variable in the +visualization. Finally, we create a bar plot using ggplot2, with +Sepal.Width on the x-axis and mean.Petal.Length on the y-axis. The +resulting plot shows the mean petal length of setosa flowers for each +sepal width category.
+Python code:
+import pandas as pd
+
+# Load the iris dataset and pipe it into the next function
+( pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data", header=None, names=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'class'])
+
+ # Select columns and pivot the dataset
+ .loc[:, ['sepal_length', 'sepal_width', 'petal_length']]
+ .melt(var_name='variable', value_name='value')
+
+ # Group by variable and calculate mean
+ .groupby('variable', as_index=False)
+ .mean()
+
+ # Filter for mean greater than 3.5 and sort by descending mean
+ .query('value > 3.5')
+ .sort_values('value', ascending=False)
+)
+
## variable value
+## 1 sepal_length 5.843333
+## 0 petal_length 3.758667
+
Here is an example of a for loop in R:
+R code
+# Create a vector of numbers
+numbers <- c(1, 2, 3, 4, 5)
+
+# Use a for loop to print out each number in the vector
+for (i in numbers) {
+ print(i)
+}
+
## [1] 1
+## [1] 2
+## [1] 3
+## [1] 4
+## [1] 5
+
In this example, the for loop iterates over each element in the numbers +vector, assigning the current element to the variable i. The print(i) +statement is then executed for each iteration, outputting the value of +i.
+Here is the equivalent example in Python:
+Python code
+# Create a list of numbers
+numbers = [1, 2, 3, 4, 5]
+
+# Use a for loop to print out each number in the list
+for i in numbers:
+ print(i)
+
## 1
+## 2
+## 3
+## 4
+## 5
+
In Python, the for loop iterates over each element in the numbers list, +assigning the current element to the variable i. The print(i) statement +is then executed for each iteration, outputting the value of i.
+Both languages also support nested for loops, which can be used to +perform iterations over multiple dimensions, such as looping through a +2D array.
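+For completeness, here is a small Python sketch of a nested for loop over a two-dimensional structure; the R version is analogous, with one for statement inside another:
+
+# Loop over every value in a 2-D list, row by row
+matrix = [[1, 2, 3],
+          [4, 5, 6]]
+
+for row in matrix:
+    for value in row:
+        print(value)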
+Parallel computing is a technique used to execute multiple computational +tasks simultaneously, which can significantly reduce the time required +to complete a task. Both R and Python have built-in support for parallel +computing, although the approaches are slightly different. In this +answer, we will compare and contrast the parallel computing capabilities +of R and Python, and provide working examples in code.
Parallel computing in R
+
+In R, there are several packages that support parallel computing, such as parallel, foreach, and doParallel. The parallel package provides basic functionality for parallel computing, while foreach and doParallel provide higher-level abstractions that make it easier to write parallel code.
+Here is an example of using the foreach package to execute a loop in +parallel:
+R code:
+library(foreach)
+library(doParallel)
+
+# Set up a parallel backend with 4 workers
+cl <- makeCluster(4)
+registerDoParallel(cl)
+
+# Define a function to apply in parallel
+myfunc <- function(x) {
+ # some computation here
+ return(x^2)
+}
+
+# Generate some data
+mydata <- 1:1000
+
+# Apply the function to the data in parallel
+result <- foreach(i = mydata) %dopar% {
+ myfunc(i)
+}
+
+# Stop the cluster
+stopCluster(cl)
+
In this example, we use the makeCluster() function to set up a cluster +with 4 workers, and the registerDoParallel() function to register the +cluster as the parallel backend for foreach. We then define a function +myfunc() that takes an input x and returns x^2. We generate some data +mydata and use foreach to apply myfunc() to each element of mydata in +parallel, using the %dopar% operator.
+R Tidyverse parallel
+In R Tidyverse, we can use the furrr package for parallel computing. +Here’s an example of using furrr to parallelize a map function:
+R Tidy code:
+library(tidyverse)
+library(furrr)
+
+# Generate a list of numbers
+numbers <- 1:10
+
+# Use the future_map function from furrr to parallelize the map function
+plan(multisession)
+squares <- future_map(numbers, function(x) x^2)
+
In this example, we first load the Tidyverse and furrr libraries. We +then generate a list of numbers from 1 to 10. We then use the plan +function to set the parallelization strategy to “multisession”, which +will use multiple CPU cores to execute the code. Finally, we use the +future_map function from furrr to apply the function x^2 to each number +in the list in parallel.
Parallel computing in Python
+
+In Python, the standard library includes the multiprocessing module, which provides basic support for parallel computing. Additionally, there are several third-party packages that provide higher-level abstractions, such as joblib and dask.
+Here is an example of using the multiprocessing module to execute a loop +in parallel:
+Python code:
+def square(x):
+ return x**2
+
+from multiprocessing import Pool
+
+# Generate a list of numbers
+numbers = list(range(1, 11))
+
+# Use the map function and a pool of workers to parallelize the square function
+with Pool() as pool:
+ squares = pool.map(square, numbers)
+
+print(squares)
+
In this example, we define a function square() that takes an input x and returns x**2. We generate a list numbers and use the Pool class from the multiprocessing module to set up a pool of worker processes (one per CPU core by default). We then use the map() method of the Pool class to apply square() to each element of numbers in parallel.
Comparison and contrast
+
+Both R and Python have built-in support for parallel computing, with similar basic functionality for creating and managing parallel processes. However, the higher-level abstractions differ between the two languages. In R, the foreach package provides a high-level interface that makes it easy to write parallel code, while in Python, the multiprocessing module provides a basic interface that can be extended using third-party packages like joblib and dask.
+Additionally, Python has better support for distributed computing using +frameworks like Apache Spark, while R has better support for +shared-memory parallelism using tools like data.table and ff.
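+To illustrate the higher-level Python option mentioned above, here is a minimal sketch of the same squaring task using joblib (assuming the joblib package is installed):
+
+from joblib import Parallel, delayed
+
+def square(x):
+    return x ** 2
+
+# Run square() over 1..10 on 4 worker processes
+squares = Parallel(n_jobs=4)(delayed(square)(i) for i in range(1, 11))
+print(squares)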
+Data wrangling is an important part of any data analysis project, and +both R and Python provide tools and libraries for performing this task. +In this answer, we will compare and contrast data wrangling in R’s +tidyverse and Python’s pandas library, with working examples in code.
+Data Wrangling in R Tidyverse
+The tidyverse is a collection of R packages designed for data science, +and it includes several packages that are useful for data wrangling. One +of the most popular packages is dplyr, which provides a grammar of data +manipulation for data frames.
+Here is an example of using dplyr to filter, mutate, and summarize a +data frame:
+R code
+library(dplyr)
+
+# Load data
+data(mtcars)
+
+# Filter for cars with more than 100 horsepower
+mtcars %>%
+ filter(hp > 100) %>%
+ # Add a new column with fuel efficiency in km per liter
+ mutate(kmpl = 0.425 * mpg) %>%
+ # Group by number of cylinders and summarize
+ group_by(cyl) %>%
+ summarize(mean_hp = mean(hp),
+ mean_kmpl = mean(kmpl))
+
## # A tibble: 3 × 3
+## cyl mean_hp mean_kmpl
+## <dbl> <dbl> <dbl>
+## 1 4 111 11.0
+## 2 6 122. 8.39
+## 3 8 209. 6.42
+
In this example, we first filter the mtcars data frame to only include +cars with more than 100 horsepower. We then use mutate to create a new +column with fuel efficiency in kilometers per liter. Finally, we group +the data by the number of cylinders and calculate the mean horsepower +and fuel efficiency.
+Data Wrangling in Python Pandas
+Pandas is a popular library for data manipulation in Python. It provides +a data frame object similar to R’s data frames, along with a wide range +of functions for data wrangling.
+Here is an example of using pandas to filter, transform, and group a +data frame:
+Python code:
+import pandas as pd
+
+# Load data
+mtcars = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/mtcars.csv')
+
+# Filter for cars with more than 100 horsepower
+filtered_mtcars = mtcars[mtcars['hp'] > 100]
+
+# Add a new column with fuel efficiency in km per liter
+filtered_mtcars = filtered_mtcars.assign(kmpl=0.425 * filtered_mtcars['mpg'])
+
+# Group by number of cylinders and calculate mean horsepower and fuel efficiency
+grouped_mtcars = filtered_mtcars.groupby('cyl').agg({'hp': 'mean',
+ 'kmpl': 'mean'})
+
In this example, we first load the mtcars data from a CSV file. We then +filter the data to only include cars with more than 100 horsepower, +using boolean indexing. We use the assign function to create a new +column with fuel efficiency in kilometers per liter. Finally, we group +the data by the number of cylinders and calculate the mean horsepower +and fuel efficiency.
+Comparison
Overall, both R’s tidyverse and Python’s pandas provide similar functionality for data wrangling. Both allow for filtering, transforming, and aggregating data frames. The syntax for performing these operations is slightly different between the two languages, with R using the %>% operator for chaining operations and Python using method chaining or the .pipe() method.
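+To show what that method chaining looks like in practice, the pandas steps above can be written as a single chain; a sketch reusing the mtcars DataFrame loaded earlier:
+
+grouped_mtcars_chained = (
+    mtcars[mtcars['hp'] > 100]
+    .assign(kmpl=lambda d: 0.425 * d['mpg'])
+    .groupby('cyl')
+    .agg({'hp': 'mean', 'kmpl': 'mean'})
+)
+print(grouped_mtcars_chained)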
+One key difference between the two languages is that R’s tidyverse +provides a consistent grammar for data manipulation across its various +packages, making it easier to learn and use. However, Python’s pandas +library has a larger developer community and is more versatile for use +in other applications, such as web development or machine learning.
+In conclusion, both R and Python provide powerful tools for data +wrangling, and the choice between the two ultimately depends on the +specific needs of the user and their familiarity
+Retrieving data from an API is a common task in both R and Python. Here +are examples of how to retrieve data from an API in both languages:
+Python
+To retrieve data from an API in Python, we can use the requests library. +Here’s an example of how to retrieve weather data from the +OpenWeatherMap API:
+Python code:
+import requests
+
+url = 'https://api.openweathermap.org/data/2.5/weather?q=London,uk&appid=API_KEY'
+
+response = requests.get(url)
+
+data = response.json()
+
+print(data)
+
This code retrieves the current weather data for London from the +OpenWeatherMap API. We first construct the API URL with the location and +API key, then use the requests.get() function to make a request to the +API. We then extract the JSON data from the response using the .json() +method and print the resulting data.
+R
+In R, we can use the httr package to retrieve data from an API. Here’s +an example of how to retrieve weather data from the OpenWeatherMap API +in R:
+R code:
+library(httr)
+
+url <- 'https://api.openweathermap.org/data/2.5/weather?q=London,uk&appid=API_KEY'
+
+response <- GET(url)
+
+data <- content(response, 'text')
+
+print(data)
+
This code is similar to the Python code above. We first load the httr +library, then construct the API URL and use the GET() function to make a +request to the API. We then extract the data from the response using the +content() function and print the resulting data.
Retrieving Data from an API in R Tidyverse
+
+In R Tidyverse, we can use the httr and jsonlite packages to retrieve and process data from an API.
+R code:
+# Load required packages
+library(httr)
+library(jsonlite)
+
+# Define API endpoint
+endpoint <- "https://jsonplaceholder.typicode.com/posts"
+
+# Retrieve data from API
+response <- GET(endpoint)
+
+# Extract content from response
+content <- content(response, "text")
+
+# Convert content to JSON
+json <- fromJSON(content)
+
+# Convert JSON to a data frame
+df <- as.data.frame(json)
+
In the above example, we use the GET() function from the httr package to +retrieve data from an API endpoint, and the content() function to +extract the content of the response. We then use the fromJSON() function +from the jsonlite package to convert the JSON content to a list, and the +as.data.frame() function to convert the list to a data frame.
Retrieving Data from an API in Python
+
+In Python, we can use the requests library to retrieve data from an API, and the json library to process the JSON data.
+Python code:
+# Load required libraries
+import requests
+import json
+
+# Define API endpoint
+endpoint = "https://jsonplaceholder.typicode.com/posts"
+
+# Retrieve data from API
+response = requests.get(endpoint)
+
+# Extract content from response
+content = response.content
+
+# Convert content to JSON
+json_data = json.loads(content)
+
+# Convert JSON to a list of dictionaries
+data = [dict(row) for row in json_data]
+
In the above example, we use the get() function from the requests +library to retrieve data from an API endpoint, and the content attribute +to extract the content of the response. We then use the loads() function +from the json library to convert the JSON content to a list of +dictionaries.
Comparison
+
+Both R Tidyverse and Python provide powerful tools for retrieving and processing data from an API. In terms of syntax, the two languages are somewhat similar. In both cases, we use a library to retrieve data from the API, extract the content of the response, and then process the JSON data. However, there are some differences in the specific functions and methods used. For example, in R Tidyverse, we use the content() function to extract the content of the response, whereas in Python, we use the content attribute. Additionally, in R Tidyverse, we use the fromJSON() function to convert the JSON data to a list, whereas in Python, we use the loads() function.
+Retrieving USA census data in R, R Tidy, and Python can be done using +different packages and libraries. Here are some working examples in code +for each language:
+R:
+To retrieve census data in R, we can use the tidycensus package. Here’s +an example of how to retrieve the total population for the state of +California:
+R code:
+library(tidycensus)
+library(tidyverse)
+
+# Set your Census API key
+census_api_key("your_api_key")
+
+# Get the total population for the state of California
+ca_pop <- get_acs(
+ geography = "state",
+ variables = "B01003_001",
+ state = "CA"
+) %>%
+ rename(total_population = estimate) %>%
+ select(total_population)
+
+# View the result
+ca_pop
+
R Tidy:
+To retrieve census data in R Tidy, we can also use the tidycensus +package. Here’s an example of how to retrieve the total population for +the state of California using pipes and dplyr functions:
+R tidy code:
+library(tidycensus)
+library(tidyverse)
+
+# Set your Census API key
+census_api_key("your_api_key")
+
+# Get the total population for the state of California
+ca_pop <- get_acs(
+ geography = "state",
+ variables = "B01003_001",
+ state = "CA"
+) %>%
+ rename(total_population = estimate) %>%
+ select(total_population)
+
+# View the result
+ca_pop
+
Python:
+To retrieve census data in Python, we can use the census library. Here’s +an example of how to retrieve the total population for the state of +California:
+Python code:
+from census import Census
+from us import states
+import pandas as pd
+
+# Set your Census API key
+c = Census("your_api_key")
+
+# Get the total population for the state of California
+ca_pop = c.acs5.state(("B01003_001"), states.CA.fips, year=2019)
+
+# Convert the result to a Pandas DataFrame
+ca_pop_df = pd.DataFrame(ca_pop)
+
+# Rename the column
+ca_pop_df = ca_pop_df.rename(columns={"B01003_001E": "total_population"})
+
+# Select only the total population column
+ca_pop_df = ca_pop_df[["total_population"]]
+
+# View the result
+ca_pop_df
+
To find Lidar data in R and Python, you typically need to start by +identifying sources of Lidar data and then accessing them using +appropriate packages and functions. Here are some examples of how to +find Lidar data in R and Python:
+R:
Identify sources of Lidar data: The USGS National Map Viewer provides access to Lidar data for the United States. You can also find Lidar data on state and local government websites, as well as on commercial data providers’ websites.
+
+Access the data: You can use the lidR package in R to read Lidar data in the LAS format. For example, the following code reads (and plots) a sample LAS file that ships with the lidR package:
+R code:
+library(lidR)
+
+# Download Lidar data
+LASfile <- system.file("extdata", "Megaplot.laz", package="lidR")
+lidar <- readLAS(LASfile)
+
+# Visualize the data
+plot(lidar)
+
Python:
Identify sources of Lidar data: The USGS 3DEP program provides access to Lidar data for the United States. You can also find Lidar data on state and local government websites, as well as on commercial data providers’ websites.
+
+Access the data: You can use the pylas and laspy packages in Python to read Lidar data in the LAS format. For example, the following code downloads Lidar data and reads it:
+Python code:
+py_install("requests")
+py_install("pylas")
+py_install("laspy")
+
import requests
+from pylas import read
+import laspy
+import numpy as np
+
+# Download Lidar data
+url = "https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_SanFrancisco_2016_LAS_2018.zip"
+lasfile = "USGS_LPC_CA_SanFrancisco_2016_LAS_2018.las"
+r = requests.get(url, allow_redirects=True)
+open(lasfile, 'wb').write(r.content)
+
+# Read the data
+lidar = read(lasfile)
+
+# Visualize the data with a simple 2-D scatter of the point locations
+import matplotlib.pyplot as plt
+plt.scatter(lidar.x, lidar.y, s=0.1)
+plt.show()
+
Data for Black Lives (https://d4bl.org/) is a movement that uses data +science to create measurable change in the lives of Black people. While +the Data for Black Lives website provides resources, reports, articles, +and datasets related to racial equity, it doesn’t provide a direct API +for downloading data.
+Instead, you can access the Data for Black Lives GitHub repository +(https://github.com/Data4BlackLives) to find datasets and resources to +work with. In this example, we’ll use a sample dataset available at +https://github.com/Data4BlackLives/covid-19/tree/master/data. The +dataset “COVID19_race_data.csv” contains COVID-19 race-related data.
+R: In R, we’ll use the ‘readr’ and ‘dplyr’ packages to read, process, +and analyze the dataset.
+R code:
+# Install and load necessary libraries
+
+library(readr)
+library(dplyr)
+
+# Read the CSV file
+url <- "https://raw.githubusercontent.com/Data4BlackLives/covid-19/master/data/COVID19_race_data.csv"
+data <- read_csv(url)
+
+# Basic information about the dataset
+print(dim(data))
+print(head(data))
+
+# Example analysis: calculate the mean of 'cases_total' by 'state'
+data %>%
+ group_by(state) %>%
+ summarize(mean_cases_total = mean(cases_total, na.rm = TRUE)) %>%
+ arrange(desc(mean_cases_total))
+
Python: In Python, we’ll use the ‘pandas’ library to read, process, and +analyze the dataset.
+Python code:
+import pandas as pd
+
+# Read the CSV file
+url = "https://raw.githubusercontent.com/Data4BlackLives/covid-19/master/data/COVID19_race_data.csv"
+data = pd.read_csv(url)
+
+# Basic information about the dataset
+print(data.shape)
+print(data.head())
+
+# Example analysis: calculate the mean of 'cases_total' by 'state'
+mean_cases_total = data.groupby("state")["cases_total"].mean().sort_values(ascending=False)
+print(mean_cases_total)
+
In conclusion, both R and Python provide powerful libraries and tools +for downloading, processing, and analyzing datasets, such as those found +in the Data for Black Lives repository. The ‘readr’ and ‘dplyr’ +libraries in R offer a simple and intuitive way to read and manipulate +data, while the ‘pandas’ library in Python offers similar functionality +with a different syntax. Depending on your preferred programming +language and environment, both options can be effective in working with +social justice datasets.
+The ProPublica Congress API provides information about the U.S. Congress +members and their voting records. In this example, we’ll fetch data +about the current Senate members and calculate the number of members in +each party.
+R: In R, we’ll use the ‘httr’ and ‘jsonlite’ packages to fetch and +process data from the ProPublica Congress API.
+R code:
+# load necessary libraries
+library(httr)
+library(jsonlite)
+
+# Replace 'your_api_key' with your ProPublica API key
+api_key <- "your_api_key"
+
+# Fetch data about the current Senate members
+url <- "https://api.propublica.org/congress/v1/117/senate/members.json"
+response <- GET(url, add_headers(`X-API-Key` = api_key))
+
+# Check if the request was successful
+if (http_status(response)$category == "Success") {
+ data <- content(response, "parsed")
+ members <- data$results[[1]]$members
+
+ # Calculate the number of members in each party
+ party_counts <- table(sapply(members, function(x) x$party))
+ print(party_counts)
+} else {
+ print(http_status(response)$message)
+}
+
##
+## D I ID R
+## 49 1 2 51
+
Python: In Python, we’ll use the ‘requests’ library to fetch data from +the ProPublica Congress API and ‘pandas’ library to process the data.
+python code:
+# Install necessary libraries
+
+import requests
+import pandas as pd
+
+# Replace 'your_api_key' with your ProPublica API key
+api_key = "your_api_key"
+headers = {"X-API-Key": api_key}
+
+# Fetch data about the current Senate members
+url = "https://api.propublica.org/congress/v1/117/senate/members.json"
+response = requests.get(url, headers=headers)
+
+# Check if the request was successful
+if response.status_code == 200:
+ data = response.json()
+ members = data["results"][0]["members"]
+
+ # Calculate the number of members in each party
+ party_counts = pd.DataFrame(members)["party"].value_counts()
+ print(party_counts)
+else:
+ print(f"Error: {response.status_code}")
+
In conclusion, both R and Python offer efficient ways to fetch and +process data from APIs like the ProPublica Congress API. The ‘httr’ and +‘jsonlite’ libraries in R provide a straightforward way to make HTTP +requests and parse JSON data, while the ‘requests’ library in Python +offers similar functionality. The ‘pandas’ library in Python can be used +for data manipulation and analysis, and R provides built-in functions +like table() for aggregating data. Depending on your preferred +programming language and environment, both options can be effective for +working with the ProPublica Congress API.
+The Nonprofit Explorer API by ProPublica provides data on tax-exempt +organizations in the United States. In this example, we’ll search for +organizations with the keyword “education” and analyze the results.
+R: In R, we’ll use the ‘httr’ and ‘jsonlite’ packages to fetch and +process data from the Nonprofit Explorer API.
+R code:
+# Install and load necessary libraries
+library(httr)
+library(jsonlite)
+
+# Fetch data for organizations with the keyword "education"
+url <- "https://projects.propublica.org/nonprofits/api/v2/search.json?q=education"
+response <- GET(url)
+
+# Check if the request was successful
+if (http_status(response)$category == "Success") {
+ data <- content(response, "parsed")
+ organizations <- data$organizations
+
+ # Count the number of organizations per state
+ state_counts <- table(sapply(organizations, function(x) x$state))
+ print(state_counts)
+} else {
+ print(http_status(response)$message)
+}
+
##
+## AZ CA CO DC FL GA HI IL Indiana LA
+## 3 22 6 5 3 2 1 2 1 1
+## MD MI MN MO MP MS NC NE NJ NM
+## 1 2 5 3 1 1 2 2 2 1
+## NY OH OK Oregon PA TX UT VA WA WV
+## 1 5 1 2 2 12 1 4 3 1
+## ZZ
+## 2
+
Python: In Python, we’ll use the ‘requests’ library to fetch data from +the Nonprofit Explorer API and ‘pandas’ library to process the data.
+Python code:
+# Install necessary libraries
+import requests
+import pandas as pd
+
+# Fetch data for organizations with the keyword "education"
+url = "https://projects.propublica.org/nonprofits/api/v2/search.json?q=education"
+response = requests.get(url)
+
+# Check if the request was successful
+if response.status_code == 200:
+ data = response.json()
+ organizations = data["organizations"]
+
+ # Count the number of organizations per state
+ state_counts = pd.DataFrame(organizations)["state"].value_counts()
+ print(state_counts)
+else:
+ print(f"Error: {response.status_code}")
+
## CA 22
+## TX 12
+## CO 6
+## MN 5
+## OH 5
+## DC 5
+## VA 4
+## AZ 3
+## WA 3
+## MO 3
+## FL 3
+## IL 2
+## GA 2
+## NC 2
+## MI 2
+## Oregon 2
+## NE 2
+## ZZ 2
+## PA 2
+## NJ 2
+## HI 1
+## MS 1
+## NY 1
+## Indiana 1
+## NM 1
+## LA 1
+## UT 1
+## MD 1
+## MP 1
+## WV 1
+## OK 1
+## Name: state, dtype: int64
+
In conclusion, both R and Python offer efficient ways to fetch and +process data from APIs like the Nonprofit Explorer API. The ‘httr’ and +‘jsonlite’ libraries in R provide a straightforward way to make HTTP +requests and parse JSON data, while the ‘requests’ library in Python +offers similar functionality. The ‘pandas’ library in Python can be used +for data manipulation and analysis, and R provides built-in functions +like table() for aggregating data. Depending on your preferred +programming language and environment, both options can be effective for +working with the Nonprofit Explorer API.
+The Campaign Finance API by the Federal Election Commission (FEC) +provides data on campaign finance in U.S. federal elections. In this +example, we’ll fetch data about individual contributions for the 2020 +election cycle and analyze the results.
+R: In R, we’ll use the ‘httr’ and ‘jsonlite’ packages to fetch and +process data from the Campaign Finance API.
+R code:
+# Install and load necessary libraries
+library(httr)
+library(jsonlite)
+
+# Fetch data about individual contributions for the 2020 election cycle
+url <- "https://api.open.fec.gov/v1/schedules/schedule_a/?api_key='OGwpkX7tH5Jihs1qQcisKfVAMddJzmzouWKtKoby'&two_year_transaction_period=2020&sort_hide_null=false&sort_null_only=false&per_page=20&page=1"
+response <- GET(url)
+
+# Check if the request was successful
+if (http_status(response)$category == "Success") {
+ data <- content(response, "parsed")
+ contributions <- data$results
+
+ # Calculate the total contributions per state
+ amounts <- sapply(contributions, function(x) x$contribution_receipt_amount)
+ states <- sapply(contributions, function(x) x$contributor_state)
+ state_totals <- aggregate(amounts, by = list(states), FUN = sum)
+ colnames(state_totals) <- c("State", "Total_Contributions")
+ print(state_totals)
+} else {
+ print(http_status(response)$message)
+}
+
## [1] "Client error: (403) Forbidden"
+
Python: In Python, we’ll use the ‘requests’ library to fetch data from +the Campaign Finance API and ‘pandas’ library to process the data.
+Python code:
+# Install necessary libraries
+
+import requests
+import pandas as pd
+
+# Fetch data about individual contributions for the 2020 election cycle
+url = "https://api.open.fec.gov/v1/schedules/schedule_a/?api_key=your_api_key&two_year_transaction_period=2020&sort_hide_null=false&sort_null_only=false&per_page=20&page=1"
+response = requests.get(url)
+
+# Check if the request was successful
+if response.status_code == 200:
+ data = response.json()
+ contributions = data["results"]
+
+ # Calculate the total contributions per state
+ df = pd.DataFrame(contributions)
+ state_totals = df.groupby("contributor_state")["contribution_receipt_amount"].sum()
+ print(state_totals)
+else:
+ print(f"Error: {response.status_code}")
+
## Error: 403
+
In conclusion, both R and Python offer efficient ways to fetch and +process data from APIs like the Campaign Finance API. The ‘httr’ and +‘jsonlite’ libraries in R provide a straightforward way to make HTTP +requests and parse JSON data, while the ‘requests’ library in Python +offers similar functionality. The ‘pandas’ library in Python can be used +for data manipulation and analysis, and R provides built-in functions +like aggregate() for aggregating data. Depending on your preferred +programming language and environment, both options can be effective for +working with the Campaign Finance API.
+Note: Remember to replace your_api_key with your actual FEC API key in +the code examples above.
+Historic redlining data refers to data from the Home Owners’ Loan +Corporation (HOLC) that created residential security maps in the 1930s, +which contributed to racial segregation and disinvestment in minority +neighborhoods. One popular source for this data is the Mapping +Inequality project (https://dsl.richmond.edu/panorama/redlining/).
+In this example, we’ll download historic redlining data for Philadelphia +in the form of a GeoJSON file and analyze the data in R and Python.
+R: In R, we’ll use the ‘sf’ and ‘dplyr’ packages to read and process the +GeoJSON data.
+R code:
+# Install and load necessary libraries
+library(sf)
+library(dplyr)
+
+# Download historic redlining data for Philadelphia
+url <- "https://dsl.richmond.edu/panorama/redlining/static/downloads/geojson/PAPhiladelphia1937.geojson"
+philly_geojson <- read_sf(url)
+
+# Count the number of areas per HOLC grade
+grade_counts <- philly_geojson %>%
+ group_by(holc_grade) %>%
+ summarize(count = n())
+
+plot(grade_counts)
+
Python: In Python, we’ll use the ‘geopandas’ library to read and process +the GeoJSON data.
+Python code:
+# Install necessary libraries
+
+
+import geopandas as gpd
+
+# Download historic redlining data for Philadelphia
+url = "https://dsl.richmond.edu/panorama/redlining/static/downloads/geojson/PAPhiladelphia1937.geojson"
+philly_geojson = gpd.read_file(url)
+
+# Count the number of areas per HOLC grade
+grade_counts = philly_geojson["holc_grade"].value_counts()
+print(grade_counts)
+
## B 28
+## D 26
+## C 18
+## A 10
+## Name: holc_grade, dtype: int64
+
In conclusion, both R and Python offer efficient ways to download and process historic redlining data in the form of GeoJSON files. The ‘sf’ package in R provides a simple way to read and manipulate spatial data, while the ‘geopandas’ library in Python offers similar functionality. The ‘dplyr’ package in R can be used for data manipulation and analysis, and pandas functions like value_counts() can be used for aggregating data. Depending on your preferred programming language and environment, both options can be effective for working with historic redlining data.
In this example, we’ll download and analyze the American Indian and Alaska Native Areas (AIANNH) TIGER/Line Shapefile from the U.S. Census Bureau. We’ll download the data for the year 2020 and count the number of AIANNH areas per legal/statistical area description (LSAD) code.
+R: In R, we’ll use the ‘sf’ and ‘dplyr’ packages to read and process the +Shapefile data.
+R code:
+# Install and load necessary libraries
+library(sf)
+library(dplyr)
+
+# Download the AIANNH TIGER/Line shapefile
+url <- "https://www2.census.gov/geo/tiger/TIGER2020/AIANNH/tl_2020_us_aiannh.zip"
+temp_file <- tempfile(fileext = ".zip")
+download.file(url, temp_file, mode = "wb")
+unzip(temp_file, exdir = tempdir())
+
+# Read the Shapefile
+shapefile_path <- file.path(tempdir(), "tl_2020_us_aiannh.shp")
+aiannh <- read_sf(shapefile_path)
+
+# Count the number of AIANNH areas per LSAD code
+state_counts <- aiannh %>%
+ group_by(LSAD) %>%
+ summarize(count = n())
+
+print(state_counts[order(-state_counts$count),])
+
## Simple feature collection with 26 features and 2 fields
+## Geometry type: GEOMETRY
+## Dimension: XY
+## Bounding box: xmin: -174.236 ymin: 18.91069 xmax: -67.03552 ymax: 71.34019
+## Geodetic CRS: NAD83
+## # A tibble: 26 × 3
+## LSAD count geometry
+## <chr> <int> <MULTIPOLYGON [°]>
+## 1 79 221 (((-166.5331 65.33918, -166.5331 65.33906, -166.533 65.33699, -1…
+## 2 86 206 (((-83.38811 35.46645, -83.38342 35.46596, -83.38316 35.46593, -…
+## 3 OT 155 (((-92.32972 47.81374, -92.3297 47.81305, -92.32967 47.81196, -9…
+## 4 78 75 (((-155.729 20.02457, -155.7288 20.02428, -155.7288 20.02427, -1…
+## 5 85 46 (((-122.3355 37.95215, -122.3354 37.95206, -122.3352 37.95199, -…
+## 6 92 35 (((-93.01356 31.56287, -93.01354 31.56251, -93.01316 31.56019, -…
+## 7 88 25 (((-97.35299 36.908, -97.35291 36.90801, -97.35287 36.908, -97.3…
+## 8 96 19 (((-116.48 32.63814, -116.48 32.63718, -116.4794 32.63716, -116.…
+## 9 84 16 (((-105.5937 36.40379, -105.5937 36.40324, -105.5937 36.40251, -…
+## 10 89 11 (((-95.91705 41.28037, -95.91653 41.28036, -95.91653 41.28125, -…
+## # ℹ 16 more rows
+
Python: In Python, we’ll use the ‘geopandas’ library to read and process +the Shapefile data.
+Python code:
+import geopandas as gpd
+import pandas as pd
+import requests
+import zipfile
+import os
+from io import BytesIO
+
+# Download the AIANNH TIGER/Line shapefile
+url = "https://www2.census.gov/geo/tiger/TIGER2020/AIANNH/tl_2020_us_aiannh.zip"
+response = requests.get(url)
+zip_file = zipfile.ZipFile(BytesIO(response.content))
+
+# Extract Shapefile
+temp_dir = "temp"
+if not os.path.exists(temp_dir):
+ os.makedirs(temp_dir)
+
+zip_file.extractall(path=temp_dir)
+shapefile_path = os.path.join(temp_dir, "tl_2020_us_aiannh.shp")
+
+# Read the Shapefile
+aiannh = gpd.read_file(shapefile_path)
+
+# Count the number of AIANNH areas per LSAD code
+state_counts = aiannh.groupby("LSAD").size().reset_index(name="count")
+
+# Sort by descending count
+state_counts_sorted = state_counts.sort_values(by="count", ascending=False)
+
+print(state_counts_sorted)
+
## LSAD count
+## 2 79 221
+## 9 86 206
+## 25 OT 155
+## 1 78 75
+## 8 85 46
+## 15 92 35
+## 11 88 25
+## 19 96 19
+## 7 84 16
+## 12 89 11
+## 5 82 8
+## 3 80 7
+## 4 81 6
+## 21 98 5
+## 20 97 5
+## 13 90 4
+## 18 95 3
+## 6 83 3
+## 17 94 2
+## 16 93 1
+## 14 91 1
+## 10 87 1
+## 22 99 1
+## 23 9C 1
+## 24 9D 1
+## 0 00 1
+
In conclusion, both R and Python offer efficient ways to download and process AIANNH TIGER/Line Shapefile data from the U.S. Census Bureau. The ‘sf’ package in R provides a simple way to read and manipulate spatial data, while the ‘geopandas’ library in Python offers similar functionality. The ‘dplyr’ package in R can be used for data manipulation and analysis, and pandas functions like groupby() and size() can be used for aggregating data. Depending on your preferred programming language and environment, both options can be effective for working with AIANNH data.
+The Bureau of Indian Affairs (BIA) provides a PDF document containing a +list of Indian Entities Recognized and Eligible To Receive Services. To +analyze the data, we’ll first need to extract the information from the +PDF. In this example, we’ll extract the names of the recognized tribes +and count the number of tribes per state.
+R: In R, we’ll use the ‘pdftools’ package to extract text from the PDF +and the ‘stringr’ package to process the text data.
+R code:
+# Install and load necessary libraries
+library(pdftools)
+library(stringr)
+library(dplyr)
+
+# Download the BIA PDF
+url <- "https://www.govinfo.gov/content/pkg/FR-2022-01-28/pdf/2022-01789.pdf"
+temp_file <- tempfile(fileext = ".pdf")
+download.file(url, temp_file, mode = "wb")
+
+# Extract text from the PDF
+pdf_text <- pdf_text(temp_file)
+tribe_text <- pdf_text[4:length(pdf_text)]
+
+# Define helper functions
+tribe_state_extractor <- function(text_line) {
+ regex_pattern <- "(.*),\\s+([A-Z]{2})$"
+ tribe_state <- str_match(text_line, regex_pattern)
+ return(tribe_state)
+}
+
+is_valid_tribe_line <- function(text_line) {
+ regex_pattern <- "^\\d+\\s+"
+ return(!is.na(str_match(text_line, regex_pattern)))
+}
+
+# Process text data to extract tribes and states
+tribe_states <- sapply(tribe_text, tribe_state_extractor)
+valid_lines <- sapply(tribe_text, is_valid_tribe_line)
+tribe_states <- tribe_states[valid_lines, 2:3]
+
+# Count the number of tribes per state
+tribe_data <- as.data.frame(tribe_states)
+colnames(tribe_data) <- c("Tribe", "State")
+state_counts <- tribe_data %>%
+ group_by(State) %>%
+ summarise(Count = n())
+
+print(state_counts)
+
## # A tibble: 0 × 2
+## # ℹ 2 variables: State <chr>, Count <int>
+
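+The empty tibble above suggests the pattern never matched: pdf_text() returns one string per page, while the helper functions expect individual lines, so the anchored regexes fail on whole-page strings. A minimal sketch of the likely fix, splitting pages into lines first (assuming the tribe_text object and libraries loaded above, and that entries end in a two-letter state abbreviation):
+
+# Split each page into individual lines before matching
+tribe_lines <- unlist(strsplit(tribe_text, "\n"))
+matches <- str_match(tribe_lines, "(.*),\\s+([A-Z]{2})$")
+tribe_data <- as.data.frame(matches[!is.na(matches[, 1]), 2:3, drop = FALSE])
+colnames(tribe_data) <- c("Tribe", "State")
+state_counts <- tribe_data %>%
+  group_by(State) %>%
+  summarise(Count = n())
+print(state_counts)
+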
Python: In Python, we’ll use the ‘PyPDF2’ library to extract text from +the PDF and the ‘re’ module to process the text data.
+Python code:
+# Import necessary libraries
+import requests
+import PyPDF2
+import io
+import re
+from collections import Counter
+
+# Download the BIA PDF
+url = "https://www.bia.gov/sites/bia.gov/files/assets/public/raca/online-tribal-leaders-directory/tribal_leaders_2021-12-27.pdf"
+response = requests.get(url)
+
+# Extract text from the PDF
+pdf_reader = PyPDF2.PdfFileReader(io.BytesIO(response.content))
+tribe_text = [pdf_reader.getPage(i).extractText() for i in range(3, pdf_reader.numPages)]
+
+# Process text data to extract (tribe, state) pairs, flattening the per-line matches
+tribes = [match for text in tribe_text for line in text.split('\n') if line
+          for match in re.findall(r'^\d+\s+(.+),\s+([A-Z]{2})', line)]
+tribe_states = [state for tribe, state in tribes]
+
+# Count the number of tribes per state
+state_counts = Counter(tribe_states)
+print(state_counts)
+
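+Note that PdfFileReader, getPage(), extractText(), and numPages belong to the older PyPDF2 interface and have been renamed in the current pypdf package. A minimal sketch of the same page extraction with the newer API, assuming the response object downloaded above:
+
+from pypdf import PdfReader  # modern successor to PyPDF2
+
+reader = PdfReader(io.BytesIO(response.content))
+tribe_text = [reader.pages[i].extract_text() for i in range(3, len(reader.pages))]
+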
In conclusion, both R and Python offer efficient ways to download and +process the list of Indian Entities Recognized and Eligible To Receive +Services from the BIA. The ‘pdftools’ package in R provides a simple way +to extract text from PDF files, while the ‘PyPDF2’ library in Python +offers similar functionality. The ‘stringr’ package in R and the ‘re’ +module in Python can be used to process and analyze text data. Depending +on your preferred programming language and environment, both options can +be effective for working with BIA data.
+In this example, we will download and analyze the National Atlas - +Indian Lands of the United States dataset in both R and Python. We will +read the dataset and count the number of Indian lands per state.
+R: In R, we’ll use the ‘sf’ package to read the Shapefile and the +‘dplyr’ package to process the data.
+R code:
+# Install and load necessary libraries
+
+library(sf)
+library(dplyr)
+
+# Download the Indian Lands dataset
+url <- "https://prd-tnm.s3.amazonaws.com/StagedProducts/Small-scale/data/Boundaries/indlanp010g.shp_nt00968.tar.gz"
+temp_file <- tempfile(fileext = ".tar.gz")
+download.file(url, temp_file, mode = "wb")
+untar(temp_file, exdir = tempdir())
+
+# Read the Shapefile
+shapefile_path <- file.path(tempdir(), "indlanp010g.shp")
+indian_lands <- read_sf(shapefile_path)
+
+# Count the number of Indian lands per state
+# state_counts <- indian_lands %>%
+# group_by(STATE) %>%
+# summarize(count = n())
+
+plot(indian_lands)
+
## Warning: plotting the first 9 out of 23 attributes; use max.plot = 23 to plot
+## all
+
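+If the per-state tally sketched in the commented-out lines is wanted, dropping the geometry column first keeps the summary lightweight; a minimal sketch, assuming the indian_lands object above contains a STATE field:
+
+# Count Indian lands per state after dropping the geometry column
+state_counts <- indian_lands %>%
+  st_drop_geometry() %>%
+  group_by(STATE) %>%
+  summarize(count = n())
+
+print(state_counts)
+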
Python: In Python, we’ll use the ‘geopandas’ and ‘pandas’ libraries to +read the Shapefile and process the data.
+Python code:
+import geopandas as gpd
+import pandas as pd
+import requests
+import tarfile
+import os
+from io import BytesIO
+
+# Download the Indian Lands dataset
+url = "https://prd-tnm.s3.amazonaws.com/StagedProducts/Small-scale/data/Boundaries/indlanp010g.shp_nt00966.tar.gz"
+response = requests.get(url)
+tar_file = tarfile.open(fileobj=BytesIO(response.content), mode='r:gz')
+
+# Extract Shapefile
+temp_dir = "temp"
+if not os.path.exists(temp_dir):
+ os.makedirs(temp_dir)
+
+tar_file.extractall(path=temp_dir)
+shapefile_path = os.path.join(temp_dir, "indlanp010g.shp")
+
+# Read the Shapefile
+indian_lands = gpd.read_file(shapefile_path)
+
+# Count the number of Indian lands per state
+state_counts = indian_lands.groupby("STATE").size().reset_index(name="count")
+
+print(state_counts)
+
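+For a quick visual check analogous to the R plot() call above, GeoDataFrames expose a plot() method; a minimal sketch, assuming matplotlib is installed alongside geopandas:
+
+import matplotlib.pyplot as plt
+
+# Plot the Indian lands polygons as a quick visual check
+indian_lands.plot()
+plt.show()
+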
Both the R and Python examples download the dataset and read the Shapefile using the respective packages. The data can then be grouped by the ‘STATE’ attribute to count the number of Indian lands per state, as the Python example does (the R example instead plots the layer as a quick check).
+ +Environmental Data Science Innovation & Inclusion Lab (ESIIL) is committed to building, maintaining, and fostering an inclusive, kind, collaborative, and diverse transdisciplinary environmental data science community, whose members feel welcome, supported, and safe to contribute ideas and knowledge.
+The 2024 ESIIL Innovation Summit will follow all aspects of the ESIIL Code of Conduct (below).
+All community members are responsible for creating this culture, embodying our values, welcoming diverse perspectives and ways of knowing, creating safe inclusive spaces, and conducting ethical science as guided by FAIR (Findable, Accessible, Interoperable, Reusable) and CARE (Collective Benefit, Authority to Control, Responsibility, and Ethics) principles for scientific and Indigenous data management, governance, and stewardship.
+ESIIL’s vision is grounded in the conviction that innovation and breakthroughs in environmental data science will be precipitated by a diverse, collaborative, curious, and inclusive research community empowered by open data and infrastructure, cross-sector and community partnerships, team science, and engaged learning.
+As such, our core values center people through inclusion, kindness, respect, collaboration, and genuine relationships. They also center innovation, driven by collaborative, cross-sector science and synthesis, open, accessible data and tools, and fun, diverse teams. Finally, they center learning, propelled by curiosity and accessible, inclusive training, and education opportunities.
+These guidelines outline behavior expectations for ESIIL community members. Your participation in the ESIIL network is contingent upon following these guidelines in all ESIIL activities, including, but not limited to, participating in meetings, webinars, hackathons, and working groups hosted or funded by ESIIL, as well as email lists and online forums such as GitHub, Slack, and Twitter. These guidelines have been adapted from those of the International Arctic Research Policy Committee, the Geological Society of America, the American Geophysical Union, the University Corporation for Atmospheric Research, The Carpentries, and others. We encourage other organizations to adapt these guidelines for use in their own meetings.
+Note: Working groups and hackathon/codefest teams are encouraged to discuss these guidelines and what they mean to them, and will have the opportunity to add to them to specifically support and empower their team. Collaborative and behavior commitments complement data use, management, authorship, and access plans that commit to CARE and FAIR principles.
+ESIIL community members are expected to act professionally and respectfully in all activities, such that each person, regardless of gender, gender identity or expression, sexual orientation, disability, physical appearance, age, body size, race, religion, national origin, ethnicity, level of experience, language fluency, political affiliation, veteran status, pregnancy, country of origin, and any other characteristic protected under state or federal law, feels safe and welcome in our activities and community. We gain strength from diversity and actively seek participation from those who enhance it.
+In order to garner the benefits of a diverse community and to reach the full potential of our mission and charge, ESIIL participants must be allowed to develop a sense of belonging and trust within a respectful, inclusive, and collaborative culture. Guiding behaviors that contribute to this culture include, but are not limited to:
+Listen carefully – we each bring our own styles of communication, language, and ideas, and we must do our best to accept and accommodate differences. Do not interrupt when someone is speaking and maintain an open mind when others have different ideas than yours.
+Be present – when engaging with others, give them your full attention. If you need to respond to outside needs, please step away from the group quietly.
+Be kind – offer positive, supportive comments and constructive feedback. Critique ideas, not people. Harassment, discrimination, bullying, aggression, including offensive comments, jokes, and imagery, are unacceptable, regardless of intent, and will not be tolerated.
+Be punctual - adhere to the schedule provided by the organizers and avoid disruptive behavior during presentations, trainings, or working sessions.
+Respect privacy - be mindful of the confidentiality of others. Always obtain explicit consent before recording, sharing, or using someone else’s personal information, photos, or recordings.
+Practice good digital etiquette (netiquette) when communicating online, whether in emails, messages, or social media - think before posting online and consider the potential impact on others. Do not share or distribute content generated by or involving others without their explicit consent.
+Create space for everyone to participate – be thoughtful about who is at the table; openly address accessibility needs, and provide multiple ways to contribute.
+Be welcoming – ESIIL participants come from a wide range of skill levels and career stages, backgrounds, and cultures. Demonstrate that you value these different perspectives and identities through your words and actions, including through correct use of names, titles, and pronouns.
+Be self-aware – recognize that positionality, identity, unconscious biases, and upbringing can all affect how words and behaviors are perceived. Ensure that your words and behavior make others feel welcome.
+Commit to ongoing learning – the move toward inclusive, equitable, and just environmental data science is a collective journey. Continue to learn about and apply practices of inclusion, anti-racism, bystander intervention, and cultural sensitivity. None of us is perfect; all of us will, from time to time, fail to live up to our own high standards. Being perfect is not what matters; owning our mistakes and committing to clear and persistent efforts to grow and improve is.
+Check your presumptions – we each bring our own ideas and assumptions about how the world should and does work – what are yours, and how do they affect how you interact with others? How do they shape your perception of new ideas?
+Ask questions – one of the strengths of interdisciplinary and diverse teams is that we all bring different knowledge and viewpoints; no one person is expected to know everything. So don’t be afraid to ask, to learn, and to share.
+Be bold – significant innovations don’t come from incremental efforts. Be brave in proposing and testing new ideas. When things don’t work, learn from the experience.
+Invite feedback – new ideas and improvements can emerge from many places when we’re open to hearing them. Check your defensiveness and listen; accept feedback as a gift toward improving our work and ourselves.
+Recognize that everyone is bringing something different to the table – take the time to get to know each other. Keep an open mind, encourage ideas that are different from yours, and learn from each other’s expertise and experience.
+Be accountable - great team science depends on trust, communication, respect, and delivering on your commitments. Be clear about your needs, as both a requester and a responder, realistic about your time and capacity commitments, and communicate timelines and standards in advance.
+Make assumptions explicit and provide context wherever possible - misunderstandings are common on transdisciplinary and cross-cultural teams and can best be managed with intentionality. Check in about assumptions, and be willing to share and correct misunderstandings or mistakes when they happen. Make use of collaboration agreements, communicate clearly and avoid jargon wherever possible.
+Respect intellectual property and Indigenous data sovereignty – ESIIL recognizes the extractive and abusive history of scientific engagement with Native peoples, and is committed to doing better. Indigenous knowledge holders are under no obligation to share their data, stories or knowledge. Their work should always be credited, and only shared with permission. Follow guidelines for authorship, Indigenous data sovereignty, and CARE principles. Acknowledge and credit the ideas and work of others.
+Use the resources that we provide - take advantage of the cyberinfrastructure and data cube at your disposal, but do not use them for unrelated tasks, as it could disrupt the event, introduce security risks, undermine the spirit of collaboration and fair play, and erode trust within the event community.
+Be safe - never share sensitive personal information; use strong passwords for your Cyverse and GitHub accounts and do not share them with other participants; be cautious of unsolicited emails, messages, or links; and verify online contacts. If you encounter any illegal or harmful activities online related to this event, report them to Tyler McIntosh or Susan Sullivan.
+Finally, speak up if you experience or notice a dangerous situation, or someone in distress!
+We adopt the full Code of Conduct of our home institution, the University of Colorado, details of which are found here. To summarize, examples of unacceptable and reportable behaviors include, but are not limited to:
+The University of Colorado recognizes all Federal and State protected classes, which include the following: race, color, national origin, sex, pregnancy, age, marital status, disability, creed, religion, sexual orientation, gender identity, gender expression, veteran status, political affiliation or political philosophy. Mistreatment or harassment not related to protected class also has a negative impact and will be addressed by the ESIIL team.
+Anyone requested to stop unacceptable behavior is expected to comply immediately.
+If there is a clear violation of the code of conduct during an ESIIL event—for example, a meeting is Zoom bombed or a team member is verbally abusing another participant during a workshop— ESIIL leaders, facilitators (or their designee) or campus/local police may take any action deemed necessary and appropriate, including expelling the violator, or immediate removal of the violator from any online or in-person event or platform without warning or refund. If such actions are necessary, there will be follow up with the ESIIL Diversity Equity and Inclusion (DEI) team to determine what further action is needed (see Reporting Process and Consequences below).
+For smaller incidents that might be settled with a brief conversation, you may choose to contact the person in question or set up a (video) conversation to discuss how the behavior affected you. Please use this approach only if you feel comfortable; you do not have to carry the weight of addressing these issues yourself. If you are interested in this option but unsure how to go about it, please contact the ESIIL DEI lead, Susan Sullivan, first—she will have advice on how to make the conversation happen and is available to join you in a conversation as requested.
+We take any reports of Code of Conduct violations seriously, and aim to support those who are impacted and ensure that problematic behavior doesn’t happen again.
+If you believe you’re experiencing or have experienced unacceptable behavior that is counter to this code of conduct, or you are witness to this behavior happening to someone else, we encourage you to contact our DEI lead:
+You may also choose to anonymously report behavior to ESIIL using this form.
+The DEI team will keep reports as confidential as possible. However, as mandatory reporters, we have an obligation to report alleged protected class violations to our home institution or to law enforcement.
+When we discuss incidents with people who are accused of misconduct (the respondent), we will anonymize details as much as possible to protect the privacy of the reporter and the person who was impacted (the complainant). In some cases, even when the details are anonymized, the respondent may guess at the identities of the reporter and complainants. If you have concerns about retaliation or your personal safety, please let us know (or note that in your report). We encourage you to report in any case, so that we can support you while keeping ESIIL members safe. In some cases, we are able to compile several anonymized reports into a pattern of behavior, and take action based on that pattern.
+If you prefer to speak with someone who is not on the ESIIL leadership team, or who can maintain confidentiality, you may contact:
+If you want more information about when to report, or how to help someone who needs to report, please review the resources at Don’t Ignore It.
+Note: The reporting party does not need to be directly involved in a code of conduct violation incident. Please make a bystander report if you observe a potentially dangerous situation, someone in distress, or violations of these guidelines, even if the situation is not happening to you.
+After a member of the ESIIL DEI team takes your report, they will (if necessary) consult with the appropriate support people at CU. The ESIIL DEI team will respond with a status update within 5 business days.
+During this time, they, or members of the CU Office of Institutional Equity and Compliance, will:
+For significant infractions, follow up to the report may be turned over to the CU Office of Institutional Equity and Compliance and/or campus police.
+What follows are examples of possible responses to an incident report. This list is not inclusive, and ESIIL reserves the right to take any action it deems necessary. Generally speaking, the strongest response ESIIL may take is to completely ban a user from further engagement with ESIIL activities and, as is required, report a person to the CU Office of Institutional Equity and Compliance and/or their home institution and NSF. If law enforcement should be involved, they will recommend that the complainant make that contact. Employees of CU Boulder may also be subject to consequences as determined by the institution.
+In addition to the responses above, ESIIL responses may include but are not limited to the following:
+Do you need more resources?
+Please don’t hesitate to contact the ESIIL DEI lead, Susan Sullivan, if you have questions or +concerns.
+The CU Office of Institutional Equity and Compliance is a resource for all of us in navigating this space. They also offer resource materials that can assist you in exploring various topics and skills here.
+If you have questions about what, when or how to report, or how to help someone else with +concerns, Don’t Ignore It.
+CU Ombud’s Office: Confidential support to navigate university situations. (Most universities +have these resources)
+The CU Office of Victims Assistance (counseling limited to CU students/staff/faculty, though +advocacy is open to everyone engaged with a CU-sponsored activity. Please look for a similar resource on your campus if you are from another institution).
+National Crisis Hotlines
+How are we doing?
+Despite our best intentions, in some cases we may not be living up to our ideals of a positive, supportive, inclusive, respectful, and collaborative community. If you feel we could do better, we welcome your feedback. Comments, suggestions, and praise are also very welcome!
+Acknowledgment
+By participating in this event, you agree to abide by this code of conduct and understand the consequences of violating it. We believe that a respectful and inclusive environment benefits all participants and leads to more creative and successful outcomes.
+Thank you for your cooperation in making this event a welcoming event for all. Have fun!
+~/data-store/data/iplant/home/shared/earthlab/forest_carbon_codefest/
+/home/jovyan/data-store
+.innovation-summit-utils
+For SSH connection to GitHub, run conda install -c conda-forge openssh in the terminal if encountering errors.
+https://<id>.cyverse.run/rstudio/auth-sign-in
+https://<id>.cyverse.run/rstudio/
+put for upload and get for download.
+This Participant Agreement (“Agreement”) is a contract between you (“You/Your” or “Participant”) and THE REGENTS OF THE UNIVERSITY OF COLORADO, a body corporate, acting on behalf of the University of Colorado Boulder, a public institution of higher education created under the Constitution and the Law of the State of Colorado (the “University”), having offices located at 3100 Marine Street, Boulder, CO 80309.
+In consideration of Your participation in the 2024 ESIIL Innovation Summit, the sufficiency of which is hereby acknowledged, You agree as follows:
+Environmental Data Science Innovation & Inclusion Lab (“ESIIL”) is a National Science Foundation (“NSF”) funded data synthesis center led by the University. Earth Lab is part of the Cooperative Institute for Research in Environmental Sciences (CIRES) specializing in data-intensive open, reproducible environmental science. ESIIL will host the Summit in person from May 13 through May 16, 2024.
+ESIIL's 2024 Innovation Summit will offer an opportunity to use big data to understand resilience across genes, species, ecosystems and societies, advance ecological forecasting with solutions in mind, and inform adaptive management and natural climate solutions. The Summit will support attendees to advance data-informed courses of action for resilience and adaptation in the face of our changing environment. It will be an in-person ‘unconference’, enabling participants to dynamically work on themes that most inspire them, with inclusive physical and intellectual spaces for working together. Over two and a half days participants will work in teams to explore research questions using open science approaches, including: data infrastructure, artificial intelligence (AI) and novel analytics, and cloud computing. Participants will be encouraged to work across and respect different perspectives, with the aim of co-developing resilience solutions. ESIIL will provide participants with opportunities to learn more about cultural intelligence, ethical and open science practices, and leadership in the rapidly evolving field of environmental data science. Overall, the Summit will capitalize on the combination of open data and analytics opportunities to develop innovative or impactful approaches that improve environmental resilience and adaptation.
+You will join a team of environmental scientists, data experts, and coders to explore curated data, consider the objectivity of the data, propose a scientific question that can be addressed with all or some of the data sets, and analyze the data in an attempt to answer your scientific question. You will present your Work to the event community. ESIIL will provide environmental data, cyberinfrastructure, training in cyberinfrastructure and data analytics, and technical support.
+By and through Your participation in the Summit, You represent and warrant the following:
+By participating in the Innovation Summit, You may receive access to certain datasets, webinars, and/or other copyrighted materials (collectively, the “Summit Assets”). You agree to follow all licenses, restrictions, and other instructions provided to You with the Summit Assets.
+The Summit Assets are provided “as is” without warranty of any kind, either express or implied, including, without limitation, any implied warranties of merchantability and fitness for a particular purpose. Without limiting the foregoing, the University does not warrant that the Summit Assets will be suitable for Your Work or that the operation or supply of the Summit Assets will be uninterrupted or error free.
+You agree not to access or use the Summit Assets in a manner that may interfere with any other participants’ or users’ use of such assets, unless provided with express written consent by the University. Your access to and use of the Summit Assets may be limited, throttled, or terminated at any time at the sole discretion of the University.
+You represent that Your Work is Your original creation. If you obtain permission to include third-party materials, You represent that Your Work includes complete details of any third-party license or other restriction (including, but not limited to, related patents and trademarks) of which You are aware and which are associated with any part of Your Work. You represent and warrant that You will not submit any materials to the University that You know or believe to have components that are malicious or harmful. You represent that You will perform a reasonable amount of due diligence in order to be properly informed of third-party licenses, infringing materials, or harmful content associated with any part of Your Work.
+You agree to make Your Work publicly available in GitHub under the MIT open-source license within five (5) months from the end of the Summit.
+TO THE EXTENT ALLOWED BY LAW, IN NO EVENT SHALL THE UNIVERSITY, ITS PARTNERS, LICENSORS, SERVICE PROVIDERS, OR ANY OF THEIR RESPECTIVE OFFICERS, DIRECTORS, AGENTS, EMPLOYEES OR REPRESENTATIVES, BE LIABLE FOR DIRECT, INCIDENTAL, CONSEQUENTIAL, EXEMPLARY OR PUNITIVE DAMAGES ARISING OUT OF OR IN CONNECTION WITH THE SUMMIT OR THIS AGREEMENT (HOWEVER ARISING, INCLUDING NEGLIGENCE). IF YOU HAVE A DISPUTE WITH ANY PARTICIPANT OR ANY OTHER THIRD PARTY, YOU RELEASE THE UNIVERSITY, ITS, PARTNERS, LICENSORS, AND SERVICE PROVIDERS, AND EACH OF THEIR RESPECTIVE OFFICERS, DIRECTORS, AGENTS, EMPLOYEES AND REPRESENTATIVES FROM ANY AND ALL CLAIMS, DEMANDS AND DAMAGES (ACTUAL AND CONSEQUENTIAL) OF EVERY KIND AND NATURE ARISING OUT OF OR IN ANY WAY CONNECTED WITH SUCH DISPUTES. YOU AGREE THAT ANY CLAIMS AGAINST UNIVERSITY ARISING OUT OF THE SUMMIT OR THIS AGREEMENT MUST BE FILED WITHIN ONE YEAR AFTER SUCH CLAIM AROSE; OTHERWISE, YOUR CLAIM IS PERMANENTLY BARRED.
+Under no circumstances will Your participation in the Summit or anything in this Agreement be construed as an offer or contract of employment with the University.
+This Agreement and the Summit shall be governed and construed in accordance with and governed by the laws of the state of Colorado without giving effect to conflict of law provisions.
+This Agreement and the Event Code of Conduct constitute the entire agreement between the University and You with respect to the Summit and supersede all previous or contemporaneous oral or written agreements concerning the Summit. In the event of a conflict between this Agreement and the Event Code of Conduct, the conflict shall be resolved with the following order of precedence:
+The invalidity, illegality, or unenforceability of any one or more phrases, sentences, clauses, or sections in this Agreement does not affect the remaining portions of this Agreement.
+If you have questions about the Summit, please contact ESIIL at esiil@colorado.edu.
+ESIIL Guidelines for Intellectual Contributions and Credit
+ +Big Data for Environmental Resilience and Adaptation
+Date: May 13-16, 2024
+Location: SEEC Auditorium, University of Colorado Boulder
+Summit Website
| Time | Event | Location |
| --- | --- | --- |
| 9:00 AM - 12:00 PM MDT | Leadership Program | S372 (Viz Studio) |
| 9:00 AM - 12:00 PM MDT | Auditorium Set Up: Tables, Questions, Handouts, etc. | SEEC Auditorium |
| 12:00 - 1:00 PM MDT | Facilitators Lunch | |
| 1:00 or 1:30 PM MDT | Concurrent Optional Activities | NEON Tour, HIKE |
| 3:00 - 5:00 PM MDT | Early Registration opens | SEEC Atrium |
| 3:00 - 4:00 PM MDT | Technical Help Desk | SEEC Auditorium |
| 4:00 - 6:00 PM MDT | Social Mixer | SEEC Cafe |
| Time | Event | Location |
| --- | --- | --- |
| 8:30 AM MDT | Registration | SEEC Atrium |
| 9:00 AM MDT | Welcome & Opening Ceremony | SEEC Auditorium |
| 9:35 AM MDT | Logistics and Planning Team Introductions | SEEC Auditorium |
| 9:45 AM MDT | Positive Polarities | SEEC Auditorium |
| 10:00 AM MDT | Navigating Miscommunications | SEEC Auditorium |
| 10:15 AM MDT | Creating a shared language | SEEC Auditorium |
| 10:30 AM MDT | Break | SEEC Atrium |
| 10:45 AM MDT | Science of Team Science | SEEC Auditorium |
| 11:05 AM MDT | Big Data for Resilience | SEEC Auditorium |
| 11:45 AM MDT | Q&A | SEEC Auditorium |
| 12:15 PM MDT | Group Photo | SEEC Atrium |
| 12:30 PM MDT | Lunch | SEEC Atrium |
| 1:30 PM MDT | Leveraging NEON to Understand Ecosystem Resilience Across Scales | SEEC Auditorium |
| 1:45 PM MDT | Explore Topics in Resilience and Adaptation | SEEC Auditorium |
| 3:15 PM MDT | Break | SEEC Atrium |
| 3:30 PM MDT | Team Breakouts: Innovation Time | Rooms available: S124, S127, S221, etc. |
| 4:20 PM MDT | Report Back | SEEC Auditorium |
| 4:50 PM MDT | Whole Group Reflection | SEEC Auditorium |
| 4:55 PM MDT | Day 1 Evaluation | SEEC Auditorium |
| 5:00 PM MDT | Day 1 Close | SEEC Auditorium |
| Time | Event | Location |
| --- | --- | --- |
| 8:30 AM MDT | Coffee & Tea | SEEC Atrium |
| 9:00 AM MDT | Welcome Back | SEEC Auditorium |
| 9:20 AM MDT | AI Research for Climate Change and Environmental Sustainability | SEEC Auditorium |
| 9:35 AM MDT | Prepare for the day | SEEC Auditorium |
| 9:50 AM MDT | Team Breakouts: Innovation Time | Breakout Spaces with your Team |
| 12:30 PM MDT | Lunch | SEEC Atrium |
| 1:30 PM MDT | Working Through the Groan Zone | SEEC Auditorium |
| 1:50 PM MDT | Team Breakouts: Innovation Time | Breakout Spaces with your Team |
| 4:10 PM MDT | Report Back | SEEC Auditorium |
| 4:50 PM MDT | Whole Group Reflection | SEEC Auditorium |
| 5:00 PM MDT | Day 2 Close | |
| Time | Event | Location |
| --- | --- | --- |
| 8:30 AM MDT | Coffee & Tea | SEEC Atrium |
| 9:00 AM MDT | Welcome Back | SEEC Auditorium |
| 9:15 AM MDT | Final Team Breakout: Prepare for the Final Report Back | Breakout Spaces with your Team |
| 9:45 AM MDT | Final Break | SEEC Atrium |
| 10:00 AM MDT | Final Report Back | SEEC Auditorium |
| 11:20 AM MDT | What’s Next? | SEEC Auditorium |
| 11:35 AM MDT | Final Reflection | SEEC Auditorium |
| 11:50 AM MDT | Closing | SEEC Auditorium |