new

Get trending papers in your email inbox!

Subscribe

byAK and the research community

Mar 17

Brilla AI: AI Contestant for the National Science and Maths Quiz

The African continent lacks enough qualified teachers which hampers the provision of adequate learning support. An AI could potentially augment the efforts of the limited number of teachers, leading to better learning outcomes. Towards that end, this work describes and evaluates the first key output for the NSMQ AI Grand Challenge, which proposes a robust, real-world benchmark for such an AI: "Build an AI to compete live in Ghana's National Science and Maths Quiz (NSMQ) competition and win - performing better than the best contestants in all rounds and stages of the competition". The NSMQ is an annual live science and mathematics competition for senior secondary school students in Ghana in which 3 teams of 2 students compete by answering questions across biology, chemistry, physics, and math in 5 rounds over 5 progressive stages until a winning team is crowned for that year. In this work, we built Brilla AI, an AI contestant that we deployed to unofficially compete remotely and live in the Riddles round of the 2023 NSMQ Grand Finale, the first of its kind in the 30-year history of the competition. Brilla AI is currently available as a web app that livestreams the Riddles round of the contest, and runs 4 machine learning systems: (1) speech to text (2) question extraction (3) question answering and (4) text to speech that work together in real-time to quickly and accurately provide an answer, and then say it with a Ghanaian accent. In its debut, our AI answered one of the 4 riddles ahead of the 3 human contesting teams, unofficially placing second (tied). Improvements and extensions of this AI could potentially be deployed to offer science tutoring to students and eventually enable millions across Africa to have one-on-one learning interactions, democratizing science education.

MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses

Scientific discovery contributes largely to human society's prosperity, and recent progress shows that LLMs could potentially catalyze this process. However, it is still unclear whether LLMs can discover novel and valid hypotheses in chemistry. In this work, we investigate this central research question: Can LLMs automatically discover novel and valid chemistry research hypotheses given only a chemistry research background (consisting of a research question and/or a background survey), without limitation on the domain of the research question? After extensive discussions with chemistry experts, we propose an assumption that a majority of chemistry hypotheses can be resulted from a research background and several inspirations. With this key insight, we break the central question into three smaller fundamental questions. In brief, they are: (1) given a background question, whether LLMs can retrieve good inspirations; (2) with background and inspirations, whether LLMs can lead to hypothesis; and (3) whether LLMs can identify good hypotheses to rank them higher. To investigate these questions, we construct a benchmark consisting of 51 chemistry papers published in Nature, Science, or a similar level in 2024 (all papers are only available online since 2024). Every paper is divided by chemistry PhD students into three components: background, inspirations, and hypothesis. The goal is to rediscover the hypothesis, given only the background and a large randomly selected chemistry literature corpus consisting the ground truth inspiration papers, with LLMs trained with data up to 2023. We also develop an LLM-based multi-agent framework that leverages the assumption, consisting of three stages reflecting the three smaller questions. The proposed method can rediscover many hypotheses with very high similarity with the ground truth ones, covering the main innovations.

CfA3: 185 Type Ia Supernova Light Curves from the CfA

We present multi-band photometry of 185 type-Ia supernovae (SN Ia), with over 11500 observations. These were acquired between 2001 and 2008 at the F. L. Whipple Observatory of the Harvard-Smithsonian Center for Astrophysics (CfA). This sample contains the largest number of homogeneously-observed and reduced nearby SN Ia (z < 0.08) published to date. It more than doubles the nearby sample, bringing SN Ia cosmology to the point where systematic uncertainties dominate. Our natural system photometry has a precision of 0.02 mag or better in BVRIr'i' and roughly 0.04 mag in U for points brighter than 17.5 mag. We also estimate a systematic uncertainty of 0.03 mag in our SN Ia standard system BVRIr'i' photometry and 0.07 mag for U. Comparisons of our standard system photometry with published SN Ia light curves and comparison stars, where available for the same SN, reveal agreement at the level of a few hundredths mag in most cases. We find that 1991bg-like SN Ia are sufficiently distinct from other SN Ia in their color and light-curve-shape/luminosity relation that they should be treated separately in light-curve/distance fitter training samples. The CfA3 sample will contribute to the development of better light-curve/distance fitters, particularly in the few dozen cases where near-infrared photometry has been obtained and, together, can help disentangle host-galaxy reddening from intrinsic supernova color, reducing the systematic uncertainty in SN Ia distances due to dust.

Solar System Elemental Abundances from the Solar Photosphere and CI-Chondrites

Solar photospheric abundances and CI-chondrite compositions are reviewed and updated to obtain representative solar system abundances of the elements and their isotopes. The new photospheric abundances obtained here lead to higher solar metallicity. Full 3D NLTE photospheric analyses are only available for 11 elements. A quality index for analyses is introduced. For several elements, uncertainties remain large. Protosolar mass fractions are H (X = 0.7060), He (Y = 0.2753), and for metals Li to U (Z = 0.0187). The protosolar (C+N)/H agrees within 13% with the ratio for the solar core from the Borexino experiment. Elemental abundances in CI-chondrites were screened by analytical methods, sample sizes, and evaluated using concentration frequency distributions. Aqueously mobile elements (e.g., alkalis, alkaline earths, etc.) often deviate from normal distributions indicating mobilization and/or sequestration into carbonates, phosphates, and sulfates. Revised CI-chondrite abundances of non-volatile elements are similar to earlier estimates. The moderately volatile elements F and Sb are higher than before, as are C, Br and I, whereas the CI-abundances of Hg and N are now significantly lower. The solar system nuclide distribution curves of s-process elements agree within 4% with s-process predictions of Galactic chemical evolution models. P-process nuclide distributions are assessed. No obvious correlation of CI-chondritic to solar elemental abundance ratios with condensation temperatures is observed, nor is there one for ratios of CI-chondrites/solar wind abundances.

Gas dynamics around a Jupiter mass planet: II. Chemical evolution of circumplanetary material

In an ongoing effort to understand planet formation the link between the chemistry of the protoplanetary disk and the properties of resulting planets have long been a subject of interest. These connections have generally been made between mature planets and young protoplanetary disks through the carbon-to-oxygen (C/O) ratio. In a rare number of systems, young protoplanets have been found within their natal protoplanetary disks. These systems offer a unique opportunity to directly study the delivery of gas from the protoplanetary disk to the planet. In this work we post-process 3D numerical simulations of an embedded Jupiter-massed planet in its protoplanetary disk to explore the chemical evolution of gas as it flows from the disk to the planet. The relevant dust to this chemical evolution is assumed to be small, co-moving grains with a reduced dust-to-gas ratio indicative of the upper atmosphere of a protoplanetary disk. We find that as the gas enters deep into the planet's gravitational well, it warms significantly (up to sim 800 K), releasing all of the volatile content from the ice phase. This change in phase can influence our understanding of the delivery of volatile species to the atmospheres of giant planets. The primary carbon, oxygen, and sulfur carrying ices: CO_2, H_2O, and H_2S are released into the gas phase and along with the warm gas temperatures near the embedded planets lead to the production of unique species like CS, SO, and SO_2 compared to the protoplanetary disk. We compute the column densities of SO, SO_2, CS, and H_2CS in our model and find that their values are consistent with previous observational studies.

Projections of Earth's Technosphere: Luminosity and Mass as Limits to Growth

Earth remains the only known example of a planet with technology, and future projections of Earth's trajectory provide a basis and motivation for approaching the search for extraterrestrial technospheres. Conventional approaches toward projecting Earth's technosphere include applications of the Kardashev scale, which suggest the possibility that energy-intensive civilizations may expand to harness the entire energy output available to their planet, host star, or even the entire galaxy. In this study, we argue that the Kardashev scale is better understood as a "luminosity limit" that describes the maximum capacity for a civilization to harvest luminous stellar energy across a given spatial domain, and we note that thermodynamic efficiency will always keep a luminosity-limited technosphere from actually reaching this theoretical limit. We suggest the possibility that an advanced technosphere might evolve beyond this luminosity limit to draw its energy directly from harvesting stellar mass, and we also discuss possible trajectories that could exist between Earth today and such hypothetical "stellivores." We develop a framework to describe trajectories for long-lived technospheres that optimize their growth strategies between exploration and exploitation, unlike Earth today. We note that analyses of compact accreting stars could provide ways to test the stellivore hypothesis, and we more broadly suggest an expansion of technosignature search strategies beyond those that reside exactly at the luminosity limit.

Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering

When answering a question, humans utilize the information available across different modalities to synthesize a consistent and complete chain of thought (CoT). This process is normally a black box in the case of deep learning models like large-scale language models. Recently, science question benchmarks have been used to diagnose the multi-hop reasoning ability and interpretability of an AI system. However, existing datasets fail to provide annotations for the answers, or are restricted to the textual-only modality, small scales, and limited domain diversity. To this end, we present Science Question Answering (ScienceQA), a new benchmark that consists of ~21k multimodal multiple choice questions with a diverse set of science topics and annotations of their answers with corresponding lectures and explanations. We further design language models to learn to generate lectures and explanations as the chain of thought (CoT) to mimic the multi-hop reasoning process when answering ScienceQA questions. ScienceQA demonstrates the utility of CoT in language models, as CoT improves the question answering performance by 1.20% in few-shot GPT-3 and 3.99% in fine-tuned UnifiedQA. We also explore the upper bound for models to leverage explanations by feeding those in the input; we observe that it improves the few-shot performance of GPT-3 by 18.96%. Our analysis further shows that language models, similar to humans, benefit from explanations to learn from fewer data and achieve the same performance with just 40% of the data. The data and code are available at https://scienceqa.github.io.