While Earth System Models (ESMs) often fail to reproduce biological fields such as phytoplankton biomass and chlorophyll, the reasons behind such failures are complex. Because phytoplankton growth rates depend on environmental conditions such as nutrients and light, and these in turn depend on the rates of mixing and upwelling, physical biases in models can produce biases in circulation, and hence in the supply of nutrients and light, such that even a “perfect” biological model will still give imperfect results. For example, an ESM in which the relationship between macronutrients and biomass matches that found in the real North Atlantic will still produce a spatial bias in the distribution of nutrients, and therefore of biomass, if the path of the North Atlantic Current is poorly simulated. If we knew that the model had the correct relationship between biomass and nutrients, we could unambiguously tie such an error to model physics. However, the apparent relationships (those seen in the real world between environmental drivers and phytoplankton biomass) are far from simple and may deviate from the intrinsic relationships, based on bench science, that are often coded into models. For example, low phytoplankton biomass may be associated with low levels of nutrients in the presence of high levels of light, or with high levels of nutrients in the presence of low levels of light. Simply plotting biomass against nutrients will then show a maximum in biomass at intermediate nutrient levels, and capturing the asymptote of biomass at high nutrient levels may require careful extrapolation. Better constraining the drivers of phytoplankton change and variability is essential to NOAA’s mission to improve the prediction of the Earth System in order to build resilience to changes.
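The difference between apparent and intrinsic relationships described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that intrinsic biomass is the product of two saturating (Monod-type) terms in nutrient and light, and that nutrient and light are anticorrelated along a hypothetical transect. The function names, half-saturation constants, and units are assumptions and are not taken from any particular model or dataset.

```python
# Illustrative only: the Monod-type form and all constants are assumptions.

def monod(x, half_sat):
    """Saturating limitation term that rises from 0 toward 1 as x increases."""
    return x / (half_sat + x)

def intrinsic_biomass(nutrient, light, k_n=1.0, k_l=1.0, b_max=1.0):
    """Hypothetical intrinsic relationship: co-limitation by nutrient and light."""
    return b_max * monod(nutrient, k_n) * monod(light, k_l)

# Hypothetical transect along which nutrient rises while light falls
# (arbitrary units), so the two drivers are anticorrelated.
n_points = 21
gradient = [i / (n_points - 1) for i in range(n_points)]
nutrient = [0.05 + 10.0 * s for s in gradient]
light = [0.05 + 10.0 * (1.0 - s) for s in gradient]

# Apparent relationship: biomass as sampled along the transect, where light
# co-varies with nutrient.
apparent = [intrinsic_biomass(n, l) for n, l in zip(nutrient, light)]

# Intrinsic relationship: biomass versus nutrient with light held fixed.
fixed_light = [intrinsic_biomass(n, 10.0) for n in nutrient]

# Location of the apparent biomass maximum along the nutrient axis.
peak_index = max(range(n_points), key=lambda i: apparent[i])

print("nutrient at apparent biomass maximum:", nutrient[peak_index])
print("apparent biomass at lowest and highest nutrient:", apparent[0], apparent[-1])
print("intrinsic biomass at lowest and highest nutrient:", fixed_light[0], fixed_light[-1])
```

In this sketch the apparent curve peaks at an intermediate nutrient level even though the intrinsic curve at fixed light increases monotonically toward an asymptote, which is why a simple scatter of biomass against nutrients can be misleading.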
This proposal directly addresses the competition's call to “examine biases in observed and modeled data/products and advance understanding of the causes for large differences between observed and modeled ocean data/products.” We aim to build on recent work showing that machine learning methods (in particular, Neural Network Ensembles) can be used both to extract biologically reasonable complex relationships from ESMs and to compare the similarity of the biological codes across models. We propose to examine whether such methods can find robust relationships between biomass and observed environmental parameters on regional and global scales, and to use the resulting relationships as metrics for evaluating Earth System Model output. We will do this using combinations of remotely sensed data (chlorophyll, carbon biovolume) and in situ data (phytoplankton biomass, nutrients, Ekman upwelling, light, mixed layer depth). We will also develop a toolkit whereby Earth System Models that are part of the current IPCC process can be compared with observational relationships and with each other.
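One way the idea of using relationships as evaluation metrics could work is sketched below. Two relationships, one standing in for a fit to observations and one for a fit to model output, are evaluated on a common grid of drivers and summarized by their root-mean-square difference. Both relationship functions, their constants, and the choice of RMS difference as the metric are hypothetical placeholders. In practice the fitted relationships would come from the machine learning step, and the metric itself is a design choice.

```python
import math

def response_curve(relationship, nutrient_grid, light):
    """Evaluate a relationship along a nutrient grid at a fixed light level."""
    return [relationship(n, light) for n in nutrient_grid]

def rms_difference(curve_a, curve_b):
    """Root-mean-square difference between two curves on the same grid."""
    squared = [(a - b) ** 2 for a, b in zip(curve_a, curve_b)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical stand-in for a relationship fitted to observations.
def observed_relationship(nutrient, light):
    return (nutrient / (1.0 + nutrient)) * (light / (1.0 + light))

# Hypothetical stand-in for a relationship fitted to model output; it differs
# from the observed one only in its nutrient half-saturation constant.
def model_relationship(nutrient, light):
    return (nutrient / (2.0 + nutrient)) * (light / (1.0 + light))

# Common nutrient grid (arbitrary units) at one fixed light level.
nutrient_grid = [0.5 * i for i in range(1, 21)]
obs_curve = response_curve(observed_relationship, nutrient_grid, light=5.0)
mod_curve = response_curve(model_relationship, nutrient_grid, light=5.0)

relationship_error = rms_difference(obs_curve, mod_curve)
self_error = rms_difference(obs_curve, obs_curve)

print("RMS difference, model vs. observed relationship:", relationship_error)
print("RMS difference, observed vs. itself:", self_error)
```

Because the two curves are evaluated on the same driver grid, a nonzero difference in this sketch reflects the relationship itself rather than where the drivers happen to be located, which is the separation of biological from physical error that the proposal targets.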