Large earthquakes, especially those occurring in cities or other population centers, create devastation and havoc, often causing numerous deaths and injuries and significant infrastructure damage that lead to billions of dollars in losses. Submarine earthquakes are the leading cause of large tsunamis, which bring deaths, destruction, displacement of populations, and possibly nuclear meltdowns. Thus, prediction of an earthquake or its aftershocks, or an earthquake early warning system, has great potential to mitigate the loss of life as well as many kinds of damage. Earthquake prediction means forecasting the occurrence of an earthquake by providing both a magnitude estimate and an accurate location. Earthquake prediction has been an important area of seismology research for quite a while and will likely remain so. Recently, with the implementation of deep learning in seismology, scientists have been able to detect, predict, and model seismic waves and earthquake aftershocks. Aftershocks are generally triggered by changes in stress produced by large earthquakes within or surrounding a given fault network. The main goal of this study is to investigate how aftershock pattern predictions improve when deep learning parameters are tuned and optimized. To achieve this goal, we have developed an algorithm that first gathers mainshock-aftershock sequence data. Among the criteria used to identify earthquakes that belong to an aftershock sequence are occurrence within a certain radius of the mainshock (we tested radii of about 0.5 degrees) and within a certain period, from a few seconds to several weeks, after the mainshock. For sequence identification, we use seismic data from the United States Geological Survey (USGS) National Earthquake Information Center (NEIC). We are also examining open-source data gathered by researchers for similar studies. The deep neural networks we are implementing use the Keras Python toolkit with the Theano and TensorFlow libraries, and we plan to replace Theano with the PyTorch library in the future because of Theano's maintenance issues. To this point, our attempts have shown good progress.
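As a sketch of the sequence-gathering step described above, the following Python snippet queries the USGS FDSN event web service for events within a chosen radius and time window of a mainshock. The mainshock coordinates, time, and window below are placeholders, and this illustrates the selection criteria rather than the study's actual algorithm.

```python
# Minimal sketch (not the authors' exact algorithm) of gathering a
# mainshock-aftershock sequence from the USGS FDSN event service using
# the criteria described above: events within ~0.5 degrees of the
# mainshock epicenter and within a chosen time window after it.
from datetime import datetime, timedelta
import requests

USGS_API = "https://earthquake.usgs.gov/fdsnws/event/1/query"

def fetch_aftershocks(lat, lon, origin_time, days=30, max_radius_deg=0.5):
    """Return candidate aftershocks near a mainshock epicenter."""
    params = {
        "format": "geojson",
        "latitude": lat,
        "longitude": lon,
        "maxradius": max_radius_deg,          # search radius in degrees
        "starttime": origin_time.isoformat(),
        "endtime": (origin_time + timedelta(days=days)).isoformat(),
        "orderby": "time-asc",
    }
    resp = requests.get(USGS_API, params=params, timeout=60)
    resp.raise_for_status()
    return resp.json()["features"]

# Hypothetical example: aftershocks within 30 days of a placeholder mainshock.
events = fetch_aftershocks(34.0, -118.0, datetime(2019, 7, 6))
for ev in events[:5]:
    print(ev["properties"]["mag"], ev["properties"]["place"])
```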
The Mount Padang archeological site has been known since the late nineteenth century as a megalithic complex sitting on the hilltop. Our studies show that the structure does not cover just the top but also wraps around the slopes, covering an area of at least about 15 ha. Comprehensive geophysical surveys combining ground-penetrating radar (GPR) and multi-channel resistivity methods, seismic tomography augmented by borehole coring data, and archeological excavations show further that the structures are not merely superficial but are rooted to greater depth. The structures were not built at once but consist of several layers from consecutive periods. The uppermost layer on the surface consists of horizontal piles of basaltic columnar rocks forming stepped terraces, decorated with exotic arrangements of upright rock columns forming walls, paths, and spaces. The second layer, previously misinterpreted as a natural rock formation and buried 1-3 meters beneath the ground surface, is a several-meter-thick fill consisting of a more compact and advanced arrangement of similar columnar rocks in a fine-grained matrix. The third layer is also an artificial arrangement of rock fragments of various kinds, extending down to about 15 meters depth, and sits on a fractured, massive basaltic lava tongue. The surveys also reveal evidence of large underground cavities or chambers. Preliminary radiocarbon dating indicates that the first layer was built around 3,000 cal BP and the second layer around 7,000 cal BP; the third layer was built prior to 9,500 cal BP and could be as old as 13,000 to 28,000 cal BP.
The true bottleneck of artificial intelligence (AI) is often not access to data but labeling it. We have large volumes of raw agricultural image data coming from various sources, and manual labeling remains a crucial step in keeping the data well organized, requiring a considerable amount of time, money, and labor. This process can be made more efficient if we can automatically label the raw data. We propose the contrastive learning representations for agriculture images (AgCLR) model, which applies self-supervised representation learning to unlabeled real-world agricultural field data to learn useful image feature representations. Contrastive learning is a self-supervised approach that enables a model to learn attributes by contrasting samples against each other without the use of labels. AgCLR leverages the state-of-the-art SimCLRv2 framework to learn representations by maximizing the agreement between differently augmented views of the same sample. We incorporated critical enablers such as mixed precision, multi-GPU distributed parallel computing, and Google Cloud Tensor Processing Units (TPUs) to optimize the training process. We achieved 80.2% accuracy when classifying the test data. We further applied AgCLR to an unrelated task, determining the alleys and rows in corn field videos for corn phenotyping, and observed two cluster formations for alleys and rows when the embeddings were plotted in three-dimensional space. We also developed a content-based image retrieval tool (pixel affinity) to identify similar images in our database, and the results were visually very promising.
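For readers unfamiliar with the contrastive objective underlying SimCLR-style training, the following is a minimal, illustrative PyTorch implementation of the NT-Xent loss that maximizes agreement between two augmented views of the same sample; it is not the AgCLR or SimCLRv2 source code.

```python
# Minimal sketch of the SimCLR-style NT-Xent contrastive loss;
# illustrative only, not the authors' code or the SimCLRv2 source.
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """z1, z2: embeddings of two augmented views, shape (batch, dim).
    Positive pairs are (z1[i], z2[i])."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2n, dim)
    sim = z @ z.t() / temperature                        # cosine similarities
    sim.fill_diagonal_(float("-inf"))                    # exclude self-pairs
    # For row i, the positive example sits n rows away.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

# Hypothetical usage with a batch of 256 embedding pairs:
z1, z2 = torch.randn(256, 128), torch.randn(256, 128)
print(nt_xent_loss(z1, z2).item())
```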
Seismology is a data-driven science, with a huge amount of data gathered for over a century. Though seismic data recording started around 1900, the growth of seismic data has been exponential in the last three decades. This growth is easily seen at just one of the largest seismological data centers in the US, the Data Management Center (DMC) of the Incorporated Research Institutions for Seismology (IRIS): data at the DMC grew from less than 10 tebibytes in 1992 to about 800 tebibytes in 2022. With the availability of such a large amount of seismic data, it is paramount to develop new seismic data processing and management tools to help analyze the data and find new and better seismic models, and such tools will help make the best use of these growing data sets. The main goal of this investigation is the development of efficient tools for the retrieval, processing, merging, aggregation, and management of big seismic data from disparate data sources. In this study, such tools are being developed with the Python programming language and open-source Python libraries, and they will be helpful for extracting, splitting, converting, merging, and processing big seismic data. Python is well suited to data science and has powerful libraries for processing and managing data and applications. Significant contributions have been made in recent Python-based libraries for seismic data processing, though there is still room for improvement in seismic applications that merge, convert, manage, and process big seismic data from disparate sources and convert between file formats. Seismic data from networks surrounding the Rio Grande Valley have been collected from different data sources, and we are using them to test the developed tools and evaluate their performance. This study has made important progress in this regard, and the results are promising.
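As an illustration of the kinds of operations these tools target, the sketch below uses the open-source ObsPy library to retrieve waveforms, merge fragmented traces, and convert formats. The network and station codes and times are placeholders, and this is not the toolset developed in this study.

```python
# Illustrative sketch of common big-seismic-data operations with ObsPy:
# fetching waveforms, merging fragmented traces, and converting formats.
from obspy import UTCDateTime
from obspy.clients.fdsn import Client

client = Client("IRIS")
t0 = UTCDateTime("2022-01-01T00:00:00")

# Retrieve one hour of vertical-component data from a placeholder network.
st = client.get_waveforms(network="TX", station="*", location="*",
                          channel="BHZ", starttime=t0, endtime=t0 + 3600)

# Merge fragmented traces from the same channel, filling small gaps.
st.merge(method=1, fill_value="interpolate")

# Convert: write each trace out in SAC format (one file per trace).
for tr in st:
    tr.write(f"{tr.id}.sac", format="SAC")
```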
Automated monitoring and evaluation systems for plant phenotyping are one of the keys to advancing and strengthening crop breeding programs. In this study, we outline improvements to a camera-based sensor system and weather station from a previous study, assembled mainly from Raspberry Pi products, with a board carrying dual cameras (RGB and NoIR) that provide high spatial and temporal resolution data. Hardware for the internet connection and the power supply system of the sensor were upgraded. Previously, the sensor could automatically capture plant images at user-defined time points; building on this, an image processing algorithm (edge computing) was developed and installed to extract digital phenotypic traits from the images after capture. With this development, the new sensor system could be integrated with the internet, and a cloud server was configured to store data online (digital traits and raw images). A real-time monitoring system was created to visualize the time series of trait development and plant images throughout the season. With such a system, plant breeders will be able to monitor multiple trials for timely crop management and decision-making, which is also resource efficient.
Imaging of plants using multi-camera arrays in high-density growth environments is a strategy for affordable high-throughput phenotyping. In multi-camera systems, simultaneous imaging of hundreds to thousands of plants eliminates the time delay in measurements between plants seen in plant-to-camera or camera-to-plant systems, which allows for the analysis of plant growth, development, and environmental responses at high temporal resolution. On the other hand, high plant density, camera-to-camera variation, and other trade-offs increase the complexity of data analysis. Here we present two recent updates to the PlantCV image analysis package that improve usability when working with multi-plant datasets. First, we introduce a method to automate detection of plants organized in a grid layout, reducing the need to make separate workflows for each camera in a multi-camera system. Second, we reduced the number of input and output parameters for functions handling the shape and location of plants and introduced automatic iteration over multiple objects of interest (e.g., plants), reducing the level of programming needed to build workflows.
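One simple way to automate grid detection, sketched below for illustration (this is not the PlantCV implementation), is to project foreground pixels of a binary plant mask onto each image axis and find peaks corresponding to grid rows and columns.

```python
# Conceptual sketch of automatic grid detection for multi-plant images;
# illustrates one way to locate plants arranged in a grid from a binary
# plant/background mask. Not the PlantCV code.
import numpy as np
from scipy.signal import find_peaks

def grid_centers(mask, nrows, ncols):
    """Estimate (row, col) centers of plants laid out in a grid.
    mask: 2D 0/1 array where foreground pixels are plants."""
    # Project foreground counts onto each axis; plants appear as peaks.
    row_profile = mask.sum(axis=1).astype(float)
    col_profile = mask.sum(axis=0).astype(float)
    # Require peaks to be roughly one grid cell apart.
    row_peaks, _ = find_peaks(row_profile, distance=mask.shape[0] // (nrows + 1))
    col_peaks, _ = find_peaks(col_profile, distance=mask.shape[1] // (ncols + 1))
    return [(r, c) for r in row_peaks[:nrows] for c in col_peaks[:ncols]]

# Hypothetical usage: a 6x4 tray of plants in a segmented image mask.
mask = np.zeros((600, 400), dtype=np.uint8)  # placeholder mask
centers = grid_centers(mask, nrows=6, ncols=4)
```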
The Amazon rainforest has a great influence on the global energy balance and carbon fluxes, being responsible for the net removal of approximately 4 million tons of carbon per year via photosynthetic activity. Climate change and deforestation affect the carbon budget in Amazonia, transforming CO2 sink areas into sources. Given the complexity of the factors that govern carbon exchange in the Amazon and its influence on biological processes, data science strategies can promote a better understanding of the main environmental factors under different scenarios and assist public policies to mitigate global warming effects. This study aims to identify the environmental factors that determine the temporal variability of carbon exchanges between the biosphere and the atmosphere in the Tapajós National Forest, in the Amazon, by applying data science strategies to an integrated set of environmental data from energy and carbon fluxes and remote sensing. The specific objective is to assess the influence of a selected set of environmental variables on the variability of carbon exchanges, using an artificial neural network (ANN) classification model to identify the variables with the greatest impact on source, sink, and neutrality scenarios. Data science strategies were applied to an integrated dataset of ground-based carbon flux measurements and remote sensing data covering the period between 2002 and 2006, and an ANN classification model was developed to identify the environmental variables with the greatest impact on carbon source, sink, and neutrality conditions. The average global score of the ANN model was 65%. It was possible to identify the predictor variables with the greatest impact on the carbon sink condition: radiation at the top of the atmosphere, sensible and latent energy fluxes, and leaf area index. Thus, the ANN model combined with an ensemble of data science strategies can improve understanding of the variability of CO2 fluxes and be a powerful tool to promote new knowledge.
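A minimal sketch of this kind of ANN classification workflow is shown below, using scikit-learn with synthetic placeholder data and permutation importance as one way to rank predictors; the actual model, data, and importance method of the study may differ.

```python
# Illustrative sketch (not the study's actual model) of an ANN classifier
# for carbon source/sink/neutral conditions, with permutation importance
# used to rank environmental predictors. Feature names are assumptions
# based on the variables mentioned in the abstract.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
features = ["toa_radiation", "sensible_heat", "latent_heat", "lai"]
X = rng.normal(size=(1000, len(features)))       # placeholder predictors
y = rng.integers(0, 3, size=1000)                # 0=source, 1=sink, 2=neutral

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
ann = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500,
                    random_state=0).fit(X_tr, y_tr)
print("score:", ann.score(X_te, y_te))

# Rank predictors by how much shuffling each one degrades the score.
imp = permutation_importance(ann, X_te, y_te, n_repeats=10, random_state=0)
for name, val in sorted(zip(features, imp.importances_mean),
                        key=lambda t: -t[1]):
    print(f"{name}: {val:.3f}")
```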
The slope mass rating (SMR) method is universally used for the characterization and classification of rock slopes. SMR is calculated from the basic rock mass rating (RMRb) by adding the product of three adjustment factors F1, F2, and F3, which are based on the geometrical relationship between the slope and the discontinuity and together reduce RMRb (F3 is negative), plus one adjustment factor F4 that depends on the excavation method used. The factors F1, F2, and F3 are mathematical functions (continuous or discrete) that require post-processing of field data on a computer for their derivation. Little work has been done to develop charts for the direct calculation of SMR in the field. In this paper, SMR charts are developed for the onsite classification of rock slopes. With the aid of these charts, an engineering geologist can easily assess the onsite SMR class of rock slopes by plotting the discontinuity dip amount (plunge amount in the case of wedge failure) and the strike parallelism between the slope dip direction and the discontinuity dip direction (slope dip direction and trend of the line of intersection in the case of wedge failure). Using SMR charts, onsite suggestions of proper remedial and preventive measures can be given for the rock slopes of any project (open-pit mines, road cut slopes, natural slopes, etc.), which accelerates the overall preliminary slope mass classification process. The proposed SMR charts are straightforward to use and can be adopted as useful tools for preliminary rock slope stability assessment.
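For reference, a minimal sketch of the underlying SMR computation for the planar failure case is given below, following Romana's formulation SMR = RMRb + (F1 × F2 × F3) + F4. The continuous forms of F1 and F2 are the commonly cited ones, while F3 and F4 are abbreviated table lookups, so the values here are illustrative only.

```python
# Minimal sketch of the SMR computation, SMR = RMRb + (F1 * F2 * F3) + F4,
# for a planar failure case. F3 below is an abbreviated version of
# Romana's discrete table; F4 is passed in directly.
import math

def smr_planar(rmr_b, slope_dip, slope_dip_dir, joint_dip, joint_dip_dir, f4=0):
    """Slope mass rating for planar failure (all angles in degrees)."""
    A = abs(joint_dip_dir - slope_dip_dir)        # strike parallelism
    f1 = (1 - math.sin(math.radians(A))) ** 2     # ~1 when parallel
    f2 = min(1.0, max(0.15, math.tan(math.radians(joint_dip)) ** 2))
    c = joint_dip - slope_dip                     # daylighting relationship
    # Abbreviated F3 lookup for planar failure (always <= 0).
    if c > 10:
        f3 = 0
    elif c > 0:
        f3 = -6
    elif c == 0:
        f3 = -25
    elif c >= -10:
        f3 = -50
    else:
        f3 = -60
    return rmr_b + f1 * f2 * f3 + f4

# Hypothetical example: RMRb of 60, natural slope (F4 = +15).
print(smr_planar(60, slope_dip=50, slope_dip_dir=180,
                 joint_dip=35, joint_dip_dir=175, f4=15))
```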
Although adequately detailed kerosene chemical-combustion Arrhenius reaction-rate suites were not readily available for combustion modeling until ca. the 1990s (e.g., Marinov), it was already known from mass-spectrometer measurements during the early Apollo era that fuel-rich liquid oxygen + kerosene (RP-1) gas generators yield large quantities (e.g., several percent of total fuel flows) of complex hydrocarbons such as benzene, butadiene, toluene, anthracene, fluoranthene, etc. (Thompson), which are formed concomitantly with soot (Pugmire). By the 1960s, virtually every fuel-oxidizer combination for liquid-fueled rocket engines had been tested, and the impact of gas-phase combustion efficiency on the rocket-nozzle efficiency factor had been empirically well determined (Clark). Until relatively recently, space-launch and orbital-transfer engines were increasingly designed for high efficiency, to maximize orbital parameters while minimizing fuel and structural masses: preburners and high-energy atomization have been used to pre-gasify fuels to increase (gas-phase) combustion efficiency, decreasing the yield of complex/aromatic hydrocarbons (which limit rocket-nozzle efficiency and overall engine efficiency) in hydrocarbon-fueled engine exhausts, thereby maximizing system launch and orbital-maneuver capability (Clark; Sutton; Sutton/Yang). The rocket combustion community has been aware that the choice of Arrhenius reaction-rate suite is critical to computer engine-model outputs, and specific combustion suites are required to estimate the yield of high-molecular-weight/reactive/toxic hydrocarbons in the rocket engine combustion chamber; nonetheless, such garbage-in/garbage-out (GIGO) errors can be seen in recent documents. Low-efficiency launch vehicles (SpaceX, Hanwha) therefore also need larger fuel loads to achieve the same launched/transferred mass, further increasing the yield of complex hydrocarbons and radicals deposited by low-efficiency rocket engines along launch trajectories and into the stratospheric ozone layer, the mesosphere, and above. With increasing launch rates from low-efficiency systems, these persistent (Ross/Sheaffer; Sheaffer), reactive chemical species must have a growing impact on critical, poorly understood upper-atmosphere chemistry systems.
Natural methane gas release from the seafloor is a widespread phenomenon that occurs at cold seeps along most continental margins. Since their discovery in the early 1980s, seeps have been the focus of intensive research, partly aimed at refining the global carbon budget. However, deep-sea research is challenging and expensive and, to date, few studies have successfully monitored the variability of methane gas release over long time periods (> 1 yr). Long-term monitoring is necessary to study the mechanisms that control seabed gas release. The M³ project, funded by the German Ministry of Education and Research, aims to study the temporal and spatial variability of gas emissions at the Southern Hydrate Ridge (SHR) by acoustically monitoring and quantifying gas effluxes over several years. Located 850 m deep on the Cascadia accretionary prism offshore Oregon, the SHR is one of the most studied seep sites, and persistent but variable gas release has been observed there for more than 20 years. Since 2015, the Ocean Observatories Initiative’s (OOI) Cabled Array observatory has provided power supply and two-way communication to the SHR, making it an ideal site for continuous long-term monitoring work. In this work, we present how we will take advantage of the OOI infrastructure and deploy several instruments on the seabed for at least 1.5 years. A multi-beam “overview” sonar mounted on a rotor will identify every gas bubble stream located within 200 m of the sonar. A scanning “quantification” sonar will be used to estimate the amount of gas released from discrete gas streams. A camera system and a CTD probe will help process and analyze the hydro-acoustic data. All instruments will be powered and controlled from land through the OOI infrastructure. We present the instrument design, the operation protocol, and the data processing steps and expected results.
An organism's phenome results from the expression of its genome (nature) under certain environmental and management effects (nurture), interactions between these factors, and measurement error. Over more than 30 years, DNA sequencing and genomics tools have advanced to the point where it is now feasible to saturate the genomes of segregating individuals, such that polymorphisms at nearly any position can be determined from other known positions. This is due to population structure, linkage disequilibrium (LD), or linkage, and it is a powerful tool for genomic prediction and for investigating biological phenomena. In contrast, most phenomics to date focuses on automating previously known "traits" as measurable and interpretable phenotypes, akin to measuring a single DNA marker rather than an entire saturated genome. Viewing phenomics as a platform for discovery, similar to genomics, opens new methods for capturing phenomena in nature and nurture. Saturating a phenome would mean that an individual's fitness, performance, responses to environment, and/or specific phenotypes could be accurately predicted in untested environments. To date, our experience with phenomic prediction for cumulative, complex phenotypes such as grain yield suggests it is possible to predict organismal performance in untested environments, possibly better than with genomic methods, despite less advanced tools and data. The factors limiting saturation of a phenome are evaluating enough individuals and environments but, more importantly, the tools and methods to extract or "sequence" more phenomic features. Successfully saturating phenomes will impact every aspect of science and society: biological disciplines from germplasm curation and physiology to breeding, as well as education, the courtroom, and policy.
From preventing scurvy to being part of religious rituals, citrus are intrinsically connected to human health and perception. From tiny mandarins to head-sized pummelos, the capability of citrus to hybridize provides a vastly diverse array of fruit sizes and shapes, which in turn corresponds to a diversity of flavors and aromas. These sensory qualities are tightly linked to the oil glands in the citrus skin. The oil glands are also key to understanding fruit development, and the essential oils they contain are fundamental in the food and perfume industries. We study the shape of citrus based on 3D X-ray CT scan reconstructions of 163 citrus samples comprising 58 different species and cultivars, including samples of all fundamental citrus species. First, using the power of X-rays and image processing, we compare and contrast size ratios between different tissues, such as the size of the skin compared to the rind or the flesh. Second, we model the fruit shape as an ellipsoidal surface, and we then study and infer possible oil gland distributions on this surface using principles of directional statistics. Finally, we compare and contrast these overall fruit shape models and their gland distributions across different citrus species. This morphological modeling will later allow us to link genotype with phenotype, furthering our insight into how physical shape is genetically specified in DNA.
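A conceptual sketch of these two modeling steps, under stated assumptions and not reflecting the authors' actual pipeline, is given below: principal component analysis to estimate ellipsoid semi-axes from surface points, and a von Mises-Fisher concentration estimate as a basic directional-statistics summary of gland positions projected to the unit sphere.

```python
# Conceptual sketch (assumptions, not the authors' pipeline): estimate
# ellipsoid semi-axes for a fruit from CT-derived surface points via PCA,
# then normalize oil-gland positions onto the unit sphere and estimate a
# von Mises-Fisher concentration (Banerjee et al. approximation).
import numpy as np

def ellipsoid_axes(points):
    """points: (n, 3) surface samples, roughly centered on the fruit."""
    centered = points - points.mean(axis=0)
    # Principal axes of the point cloud approximate the ellipsoid axes.
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered.T))
    semi_axes = np.sqrt(eigvals) * np.sqrt(3)    # crude scale estimate
    return semi_axes, eigvecs

def vmf_concentration(directions):
    """Approximate vMF kappa for unit vectors of shape (n, 3)."""
    r_bar = np.linalg.norm(directions.mean(axis=0))
    return r_bar * (3 - r_bar**2) / (1 - r_bar**2)

# Hypothetical usage with random stand-ins for CT data:
rng = np.random.default_rng(1)
surface = rng.normal(size=(5000, 3)) * np.array([3.0, 2.5, 2.0])
glands = rng.normal(size=(300, 3))
glands /= np.linalg.norm(glands, axis=1, keepdims=True)
print(ellipsoid_axes(surface)[0], vmf_concentration(glands))
```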
Lying in an arid zone frequently subjected to high winds, south-central Arizona is impacted by several blowing-dust events or dust storms every year. Major consequences of these events are visibility impairment and ensuing road traffic accidents, and a variety of health issues induced by inhaling polluted air loaded with the fine particulate matter produced by wind erosion. Despite such problems, and thus a need for guidance on mitigation efforts, studies dealing with dust source attribution for the region are largely missing. Furthermore, existing dust models exhibit large uncertainties and deficiencies in simulating dust events, rendering them of limited use in attribution studies or early warning systems. Therefore, to address some of these model issues, we have developed a high-resolution (1 km) dust modeling system by building upon an existing modeling framework consisting of the Weather Research and Forecasting (WRF), FENGSHA (a dust emission model), and Community Multiscale Air Quality (CMAQ) models. In addition to incorporating new representations in the dust emission scheme, including a roughness correction factor, sandblasting efficiency, and a dust source mask, we implemented up-to-date, very high-resolution data on land use, soil texture, and vegetation index in the dust model. We used the revised dust modeling system to simulate a springtime dust storm (08–09 April 2013) of relatively long duration that caused a regional traffic incident involving minor injuries. The model simulations compared reasonably well against observations of the concentration of particulate matter with a diameter of 10 μm and smaller (PM₁₀), satellite-derived dust optical depth, and the vertical profile of aerosol subtypes. Interestingly, the simulation results revealed that anthropogenic (cropland) dust sources contributed more than half (~53 %, or 260 µg/m³) of total PM₁₀ during the dust storm over the region including Phoenix and western Pinal County. Our findings challenge the conventional wisdom that the desert is the main dust source for this region and suggest that regional air quality modeling over dryland regions should also emphasize an improved representation of dust from agricultural lands, especially during high-wind episodes. Such representations have the potential to inform decision-making to reduce windblown-dust-related hazards to public health and safety.
Automatic and accurate plant phenotyping plays an important role in improving crop yield by enabling efficient plant analysis and plant breeding studies. 3D deep learning allows automatic segmentation of plant parts from point cloud data. However, network architectures are typically designed manually, and their performance is limited by prior experience. The aim of this study is to search for optimal 3D deep networks for plant part segmentation. We perform 3D neural architecture search by training a super network composed of candidate networks; using the trained super network, evolutionary search is then used to find the top-performing architecture. The results demonstrate that the searched architecture outperforms manually designed architectures, attaining a mean IoU and accuracy of more than 90% and 96%, respectively. The searched architecture achieves more than 83% class-wise IoU for each of the main stem, branch, and boll classes. This plant part segmentation method shows promising results and holds potential to be utilized by plant breeders for enhancing production quality.
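The following schematic sketch illustrates the general evolutionary search technique over a trained super network (it is not the study's code): candidate sub-networks are encoded as per-layer choice vectors, scored with a placeholder fitness standing in for validation mIoU under inherited supernet weights, and evolved by mutation.

```python
# Schematic sketch of evolutionary architecture search over a supernet;
# the search-space dimensions and the fitness function are placeholders.
import random

N_LAYERS, N_CHOICES = 8, 4          # assumed search-space dimensions

def evaluate(arch):
    """Placeholder for validation mIoU of `arch` using supernet weights."""
    return random.random()          # stand-in fitness

def mutate(arch, p=0.2):
    # Each layer choice is resampled with probability p.
    return [random.randrange(N_CHOICES) if random.random() < p else c
            for c in arch]

population = [[random.randrange(N_CHOICES) for _ in range(N_LAYERS)]
              for _ in range(20)]
for generation in range(10):
    scored = sorted(population, key=evaluate, reverse=True)
    parents = scored[:5]            # keep the top performers
    population = parents + [mutate(random.choice(parents))
                            for _ in range(15)]
best = max(population, key=evaluate)
print("best architecture encoding:", best)
```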
Unmanned aerial vehicle (UAV)-based imagery has become widely used for collecting agronomic traits, enabling a much greater volume of data to be generated in a time-series manner. As one of the cutting-edge imagery analysis tools, machine learning-based object detection provides automated techniques to analyze these imagery data. In our previous study, UAVs were used to collect aerial photography for field trials of 233 diverse inbred lines grown under different nitrogen treatments. Images were collected at different plant developmental stages throughout the growing season. This dataset of images has been used here to develop machine learning techniques that obtain automated tassel counts at the plot level through the season. To improve detection accuracy, we developed an image segmentation method to remove non-tassel pixels and then feed the filtered images into machine learning algorithms. As a result, our method showed a significant improvement in the accuracy of maize tassel detection. This method can be used in future research to produce time-series counts of tassels at the plot level and will allow for accurate estimates of flowering-related traits, such as the earliest detected flowering date and the duration of each plot's flowering period. These phenotypic data and the trait-associated genes provide new opportunities for crop improvement and for facilitating future plant breeding.
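As an illustration of the segmentation-before-detection idea, the sketch below suppresses likely non-tassel pixels with an HSV color threshold before detection; the threshold values are assumptions for demonstration, not the tuned parameters of this study.

```python
# Illustrative sketch of filtering non-tassel pixels before detection;
# threshold values are assumptions, not the study's tuned parameters.
import cv2
import numpy as np

def filter_non_tassel(bgr_img):
    hsv = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2HSV)
    # Keep bright, low-saturation pixels (tassels tend to be pale/yellowish
    # against the darker green canopy in nadir UAV images).
    mask = cv2.inRange(hsv, np.array([15, 0, 150]), np.array([45, 120, 255]))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))
    return cv2.bitwise_and(bgr_img, bgr_img, mask=mask)

# Hypothetical usage: filtered plot images are then fed to the detector.
img = cv2.imread("plot_0001.png")          # placeholder path
if img is not None:
    filtered = filter_non_tassel(img)
    cv2.imwrite("plot_0001_filtered.png", filtered)
```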
High-oil tobacco varieties have recently been engineered to produce increased leaf oil content for future food and fuel needs. An engineered variety of Nicotiana tabacum produces ~30 percent of leaf dry weight in lipids in the form of triacylglycerol (TAG), a significant increase relative to the less than 1 percent storage oil normally found in wild-type leaves. This high-oil tobacco also accumulates oil bodies in stomatal guard cells. To understand the impact of oil on guard cell shape, aperture, and dynamics, we have co-opted computer vision tools in PlantCV to create an accurate, flexible, and high-throughput method for microscopy image analysis of stomata. To this end, leaf impressions are made with silicone putty, and clear nail polish peels of the putty impressions are imaged using light microscopy. Binary thresholding followed by point-and-click regions of interest and morphology calculations provides stomatal counts, aperture, and other shape characteristics. Applying this method to high-oil tobacco demonstrated reduced stomatal aperture but the same number of stomata per unit leaf area, providing a mechanistic explanation of high-oil tobacco responses to high temperature and water deficit stresses.
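A conceptual sketch of such a pipeline is given below, written with OpenCV rather than PlantCV's own functions to avoid misquoting its API: thresholding, region-of-interest cropping, and per-contour shape measurements yielding counts and an aperture proxy. Paths, ROI bounds, and the area cutoff are assumptions.

```python
# Conceptual sketch of a stomata measurement pipeline; written with
# OpenCV for illustration, not with PlantCV's own function calls.
import cv2

img = cv2.imread("peel_image.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
if img is not None:
    # Otsu threshold separates dark stomatal pores from the background.
    _, binary = cv2.threshold(img, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    roi = binary[100:900, 100:1200].copy()    # assumed region of interest
    contours, _ = cv2.findContours(roi, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    stomata = [c for c in contours if cv2.contourArea(c) > 50]
    print("stomatal count:", len(stomata))
    for c in stomata:
        # Minor axis of the fitted ellipse approximates pore aperture.
        if len(c) >= 5:
            (_, _), (ax1, ax2), _ = cv2.fitEllipse(c)
            print("aperture ~", min(ax1, ax2), "px")
```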
We introduce a computer program for time series tuning and analysis. The AnalySeries program of Paillard et al., well known in the paleoclimate community, is restricted to 32-bit Mac OS, and given Apple's plans to move entirely to a 64-bit system, it will not be supported in future upgrades of the Macintosh OS. QAnalySeries is an attempt to re-implement the major functionality of AnalySeries, thus providing the community with a useful tool. QAnalySeries is written using the Qt SDK as free software and can be run on Macintosh, Windows, and Linux systems. Paillard, D., L. Labeyrie, and P. Yiou (1996), Macintosh program performs time-series analysis, Eos Trans. AGU, 77: 379.
We investigate the robustness of 3D point-based deep learning for organ segmentation of 3D plant models against varying reconstruction quality of the surface. The reconstruction quality is quantified in two ways: 1) the number of acquisitions for partial 3D scans and 2) the amount of noise. High-quality models of real rosebush plants are used to collect point clouds in a controlled simulation environment as a way to degrade surface quality systematically. We show that the well-known 3D point-based neural network PointNet++ is capable of operating effectively on low-quality and corrupted data for the task of plant organ segmentation. The results indicate that investing in developing deep learning methods has the potential to advance applications of automated phenotyping, especially for low-quality 3D point clouds of plants.
Keywords: plant phenotyping, organ segmentation, robustness analysis, point-based deep learning
Figure 1: A 3D rosebush model from the ROSE-X data set: (a) point cloud; (b) triangular mesh model.
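A minimal sketch of the two degradation modes, illustrative only and not the paper's simulation environment, is given below: random subsampling to mimic fewer acquisitions and Gaussian perturbation to mimic sensor noise.

```python
# Minimal sketch of degrading a point cloud before feeding it to a
# segmentation network such as PointNet++; illustrative only.
import numpy as np

def degrade(points, keep_fraction=0.5, noise_sigma=0.005, seed=0):
    """points: (n, 3) array. Simulates fewer acquisitions by random
    subsampling and sensor noise by Gaussian perturbation."""
    rng = np.random.default_rng(seed)
    n_keep = int(len(points) * keep_fraction)
    idx = rng.choice(len(points), size=n_keep, replace=False)
    kept = points[idx]
    return kept + rng.normal(scale=noise_sigma, size=kept.shape)

# Hypothetical usage on a stand-in cloud of 10,000 points:
cloud = np.random.default_rng(1).uniform(size=(10000, 3))
for sigma in (0.0, 0.005, 0.02):      # increasing corruption levels
    noisy = degrade(cloud, keep_fraction=0.5, noise_sigma=sigma)
    print(sigma, noisy.shape)
```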