Research tracks
1. Psychometrics applied to LLMs
Behavioral evaluation of LLMs under structured experimental conditions: assigned roles, decision constraints, marginal context shifts, channel adjustments, scoring procedures.
I dislike impressionistic testing and benchmarking, so I build measurement instruments that are systematic, comparable, scorable, and replicable.
2. Societal-level examination of frontier AI
How deployment alters incentive schemas, coordination dynamics and institutional equilibria across preexisting human systems.
I believe that innovation should be assessed beyond capability gains, with attention to its underlying implications and trade-offs.
3. Legal reasoning evaluation in LLMs
Factual causation under theory constraints, normative confounders, theory-bound case reconstruction, and adversarial deliberation with independent judges.
Having worked in LegalTech, I've seen first-hand how difficult it is to introduce LLMs for real tasks beyond summarizing documents. I therefore believe that legal reasoning is the most important pillar to study and, hopefully, crack.
4. Interactive dynamics in generative systems
Behavioral evaluation of multi-agent societies of interacting LLMs under structured experimental conditions.
Looking at individual agents is important, of course. But what about individual agents that are influenced by other agents? And what about the global effects that emerge in aggregate? Can the probabilistic nature of LLMs help recreate human variance?
The 4th track (“Interactive dynamics in generative systems”) will be inactive until 2027: it is not a current priority, and it depends on significant advances in the 1st track (“Psychometrics applied to LLMs”).
Other research
I'm a secondary contributor to a project led by my advisor, Mariano Beiró, to build a live graphical interface that tracks humanitarian crises, provides meaningful information about events, and leverages context to enable effective clustering. My contribution there is not technical; I bring International Relations knowledge, building the theoretical framework that supports classification and strengthens the methodology.
Old interests
Before my current focus, the winding road toward my undergraduate thesis took me through other themes: maritime governance, high-seas sustainability, policy analysis, common-pool resources, and China studies. I no longer integrate these into my current research tracks.