New Supermicro Server Supercharges School of Business Research
Nvidia H100 GPU server from Supermicro will provide a significant boost in AI and fintech research
In a 1956 edition of Stevens: A College for Engineers, the publication highlighted advanced work opportunities that allowed students to take the theories learned in the classroom and apply them through laboratory work. At the time, Stevens boasted access to an “ELECTRONIC COMPUTER, with the capacity sixty percent greater than any of its kind,” which assisted students with their research and designs.
Nearly 70 years later, students, along with faculty and corporate partners, will now have access to an Nvidia H100 GPU server from Supermicro through the Hanlon Financial Systems Center in the Stevens School of Business to enhance their own work in the areas of artificial intelligence and financial technologies.
Access to this powerful new server will enable experimentation that wasn’t possible before, allowing faculty and students to develop breakthrough research at a much faster pace and keeping the School of Business at the forefront of leading-edge research. The speed and new capabilities will also allow corporate partners to scale up their work. The addition of this unit comes after the School of Business added its first H100 server more than a year ago.
AI use case benefits
The Nvidia H100 GPUs, built on the Hopper architecture, provide exceptional computational power specifically designed for AI tasks. This results in faster model training and inference, which is crucial for developing and deploying complex AI models. The inclusion of Tensor Cores in the H100 GPUs optimizes deep learning operations by significantly accelerating matrix computations and improving throughput for neural networks. Additionally, these GPUs support mixed precision (FP16/FP32) computing, which enhances computation speed while maintaining accuracy, a key factor in training large AI models. Furthermore, the server integrates software optimizations and AI frameworks, such as CUDA, cuDNN and TensorRT, which are tailored for Nvidia GPUs, thereby enhancing both performance and ease of development.
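The value of mixed precision described above can be illustrated with a small, hypothetical sketch (not Stevens code): FP16 halves memory traffic and boosts throughput, but its machine epsilon near 1.0 is about 9.8e-4, so small updates can vanish; accumulating in FP32, the pattern the H100's Tensor Cores implement in hardware, preserves them.

```python
import numpy as np

# Illustrative sketch of why mixed precision (FP16 storage, FP32
# accumulation) preserves accuracy while keeping FP16's speed and
# memory savings.

# A small increment is lost entirely in FP16 arithmetic...
assert np.float16(1.0) + np.float16(1e-4) == np.float16(1.0)
# ...but survives when the accumulator is FP32.
assert np.float32(1.0) + np.float32(1e-4) > np.float32(1.0)

# Mixed-precision reduction: compact FP16 storage, FP32 accumulator.
weights = np.full(10_000, 1e-4, dtype=np.float16)  # FP16 storage
total = np.sum(weights, dtype=np.float32)          # FP32 accumulation
```

Frameworks such as PyTorch automate this pattern (automatic mixed precision), which is part of what the CUDA/cuDNN/TensorRT stack mentioned above optimizes for on Nvidia GPUs.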
Finance use case benefits
The server's high-performance capabilities ensure low latency, which is crucial for high-frequency trading algorithms that need to process and react to market data in microseconds. The H100 GPUs can handle large datasets and complex calculations required for risk analysis, portfolio optimization and predictive modeling, delivering results faster than traditional CPU-based systems. Financial models often involve Monte Carlo simulations and other computationally intensive methods, which benefit from the parallel processing power of multiple GPUs. For tasks like sentiment analysis, market prediction and anomaly detection, the server's AI capabilities enable more accurate and faster model training and inference. Additionally, financial research often involves analyzing large volumes of structured and unstructured data, and the H100 server's ability to process big data efficiently aids in uncovering insights and trends.
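The Monte Carlo workloads mentioned above are a natural fit for GPUs because every simulated path is independent. A hedged sketch (function name and parameters are illustrative): a vectorized Monte Carlo estimate of a European call price under geometric Brownian motion; on an H100 the same array expression can be moved onto the GPU by swapping NumPy for a GPU array library such as CuPy or PyTorch.

```python
import numpy as np

# Vectorized Monte Carlo pricing of a European call under geometric
# Brownian motion: simulate terminal prices, average the discounted payoff.
def mc_call_price(s0, k, r, sigma, t, n_paths, seed=0):
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_paths)                 # one normal draw per path
    st = s0 * np.exp((r - 0.5 * sigma**2) * t + sigma * np.sqrt(t) * z)
    payoff = np.maximum(st - k, 0.0)                 # call payoff at expiry
    return np.exp(-r * t) * payoff.mean()            # discounted average

# With these parameters the Black-Scholes closed form gives roughly 10.45.
price = mc_call_price(s0=100.0, k=100.0, r=0.05, sigma=0.2, t=1.0,
                      n_paths=1_000_000)
```

Because the million paths are computed as one array operation, the work maps directly onto the GPU's parallel cores, which is why such simulations see large speedups over CPU loops.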
Stevens H100 Projects: Past and Present
FinMem: A Performance-Enhanced LLM Trading Agent with Layered Memory and Character Design
Authors: Yangyang Yu, Haohang Li, Zhi Chen, Yuecheng Jiang, Yang Li, Denghui Zhang, Rong Liu, Jordan W. Suchow and Khaldoun Khashanah
In this project, the team constructed the FinMem framework, a state-of-the-art LLM-based autonomous trading agent that outperforms its reinforcement learning counterparts. The framework encompasses three core modules: Profiling, which customizes the agent’s characteristics; Memory, which uses layered message processing to help the agent assimilate hierarchical financial data; and Decision-making, which converts insights gained from memories into investment decisions. Notably, FinMem’s memory module closely mirrors the cognitive structure of human traders, offering robust interpretability and real-time tuning. Its adjustable cognitive span allows for the retention of critical information beyond human perceptual limits, thereby enhancing trading outcomes. The framework enables the agent to self-evolve its professional knowledge, respond nimbly to new investment cues, and continuously refine trading decisions in a volatile financial environment. The H100 GPUs provide a highly efficient inference endpoint for the large language models (LLMs) that power individual agents, supporting the main experiments.
Learn more: Conference Publication - Large Language Model (LLM) Agents @ ICLR 2024 and Journal Publication - IEEE Transactions on Big Data (under review, minor revision)
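The layered-memory idea can be illustrated with a toy sketch. The class, layer names, and half-lives below are hypothetical, in the spirit of FinMem's Memory module but not the authors' implementation: each layer decays at a different rate, so slow-moving fundamentals persist in the agent's context while daily headlines fade.

```python
# Hypothetical layered memory store: retrieval ranks events by
# importance weighted by a per-layer exponential recency decay.
HALF_LIFE_DAYS = {"shallow": 1.0, "intermediate": 7.0, "deep": 90.0}

class LayeredMemory:
    def __init__(self):
        self.events = []  # (layer, day_recorded, importance, text)

    def add(self, layer, day, importance, text):
        self.events.append((layer, day, importance, text))

    def retrieve(self, today, top_k=3):
        # Score = importance x exponential recency decay for the layer.
        def score(event):
            layer, day, importance, _ = event
            recency = 0.5 ** ((today - day) / HALF_LIFE_DAYS[layer])
            return importance * recency
        ranked = sorted(self.events, key=score, reverse=True)
        return [text for *_, text in ranked[:top_k]]

mem = LayeredMemory()
mem.add("shallow", day=0, importance=1.0, text="intraday headline")
mem.add("deep", day=0, importance=1.0, text="balance-sheet fundamentals")
# A month later, the deep-layer memory still dominates retrieval.
```

Tuning the half-lives is one way to realize an "adjustable cognitive span": lengthening them lets the agent retain critical information longer than a human trader's working memory would.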
The FinBen: A Holistic Financial Benchmark for Large Language Models
Authors: Yangyang Yu, Haohang Li, Yuechen Jiang et al.
LLMs have transformed natural language processing and shown promise in various fields, yet their potential in finance remains underexplored due to a lack of thorough evaluations and the complexity of financial tasks. This, along with the rapid development of LLMs, highlights the urgent need for a systematic financial evaluation benchmark for LLMs. In response to this need, FinBen is introduced as the first comprehensive open-sourced evaluation benchmark specifically designed to thoroughly assess the capabilities of LLMs in the financial domain. FinBen encompasses 35 datasets across 23 financial tasks, organized into three spectrums of difficulty inspired by the Cattell-Horn-Carroll theory, to evaluate LLMs’ cognitive abilities in inductive reasoning, associative memory, quantitative reasoning, crystallized intelligence, and more. The evaluation of 15 representative LLMs, including GPT-4, ChatGPT, and the latest Gemini, reveals insights into their strengths and limitations within the financial domain. In collaboration with researchers from Yale, Stony Brook, NYU, and other institutions, the team investigated the use of LLMs for named entity recognition, classification, and trading tasks. Numerous models were deployed on H100 GPUs to support these diverse tasks.
Let Clickstream Talk: A Graph Neural Network Approach to Sales Forecasting
Authors: Rong Liu, Haohang Li, Zihan Chen, Xuying Zhao and Mai Feng
In this project, a new model capable of using click-graph features was proposed. With the novel design of a burst detection module and shop journey embedding, the model outperforms the state-of-the-art time series model for sales prediction and is able to detect and follow short-term sales trends without additional information. The project involved designing a time series deep learning model that uses a click graph as its input feature structure. During the experiments, the H100 GPU was utilized to scale the number of concurrent training iterations, reducing the time needed to benchmark the model by 5 to 10 times.
Learn more: Journal Publication - Production and Operations Management (under review)
Actively Learning a Bayesian Matrix Fusion Model with Deep Side Information
Authors: Yangyang Yu and Jordan Suchow
High-dimensional deep neural network representations of images and concepts can be aligned to predict human annotations of diverse stimuli. However, such alignment requires the costly collection of behavioral responses, resulting in deep-feature spaces being only sparsely sampled in practice. An active learning approach is proposed to adaptively sample experimental stimuli to efficiently learn a Bayesian matrix factorization model with deep side information, demonstrating a significant efficiency gain over a passive baseline. Furthermore, with a sequential batched sampling strategy, the algorithm is applicable not only to small datasets collected from traditional laboratory experiments but also to settings where large-scale crowdsourced data collection is needed to accurately align the high-dimensional deep feature representations derived from pre-trained networks. This provides cost-effective solutions for collecting behavioral data and generating high-quality predictions in large-scale behavioral and cognitive studies.
Learn more: Conference Publication - COGSCI
Control in Stochastic Environment with Delays: A Model-based Reinforcement Learning Approach
Authors: Zhiyuan Yao, Ionut Florescu and Chihoon Lee
In this paper, a new reinforcement learning method for control problems in environments with delayed feedback is introduced. Specifically, the method employs stochastic planning, unlike previous methods that used deterministic planning. This approach allows for the embedding of risk preference in the policy optimization problem. It is demonstrated that this formulation can recover the optimal policy for problems with deterministic transitions. The new policy is contrasted with two prior methods from the literature. The methodology is first applied to simple tasks to understand its features. Subsequently, the performance of the methods is compared in controlling multiple Atari games.
Learn more: Conference Publication - The 34th International Conference on Automated Planning and Scheduling
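The core difficulty of delayed feedback, and the role of stochastic planning, can be sketched in a few lines. The toy random-walk model and function names below are illustrative, not the paper's implementation: with an observation delay of d steps, the agent only sees the state from d steps ago, so a model-based controller first rolls a stochastic model forward over the actions already taken, producing a distribution over the true current state rather than a single deterministic guess.

```python
import random

# Roll a stochastic dynamics model forward over the actions taken since
# the last observed (delayed) state, sampling fresh noise on each rollout
# to build an empirical distribution over the true current state.
def sample_current_states(model_step, last_obs, pending_actions,
                          n_samples=200, seed=0):
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        s = last_obs
        for a in pending_actions:      # replay actions taken since last_obs
            s = model_step(s, a, rng)  # each rollout samples its own noise
        samples.append(s)
    return samples                     # empirical state distribution

# Toy stochastic dynamics: a 1-D random walk nudged by the action.
def noisy_walk(s, a, rng):
    return s + a + rng.choice([-1, 0, 1])

states = sample_current_states(noisy_walk, last_obs=0,
                               pending_actions=[1, 1, 1])
```

Planning against this sampled distribution, instead of a single deterministic rollout, is what allows a risk preference to be embedded in the policy optimization: a risk-averse controller can penalize the spread of outcomes, not just their mean.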
Bi-channel Multi-modal Object Matching with VLM-powered Open-ended Concepts Generation (BiOCGen)
In this project, the plan is to pretrain the BLIP-2 model to enable the vision-language model to generate complementary fashion product recommendations. The H100 GPU will serve as the hardware for hosting the model during training and inference.