Spaces:

technicolor
/

InteractiveSurvey

Sleeping

App Files Files Community

InteractiveSurvey / src /demo /asg_conclusion.py

technicolor

Add Django InteractiveSurvey project

a97d040 6 days ago

raw

history blame contribute delete

15 kB

	import os


	class ConclusionGenerator:
	def __init__(self, pipeline):
	self.pipeline = pipeline

	def generate(self, title, intro, mode='lora'):
	if mode == 'lora' or mode == 'test':
	if mode == 'lora':
	self.pipeline.model.set_adapter("conclusion")

	system_prompt = f'''You are a helpful assistant that help to generate the conclusion of the survey paper given the survey title and survey introduction.'''
	# user_prompt = {"survey_title":survey_title, "claims":cluster_with_claims}
	user_prompt = f'''Help me to generate the conclusion of a survey paper given the title: {title}, and and the introduction:{intro}'''

	messages = [
	{"role": "system", "content": system_prompt},
	{"role": "user", "content": user_prompt},
	{"role": "assistant", "content":"Conclusion: This survey "}
	]

	outputs = self.pipeline(
	messages,
	max_new_tokens=4096,
	)
	result = outputs[0]["generated_text"][-1]['content']
	return result
	else:
	raise ValueError('mode not supported')

	if __name__ == '__main__':
	from transformers import pipeline
	import torch
	import transformers

	model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"
	Global_pipeline = transformers.pipeline(
	"text-generation",
	model=model_id,
	model_kwargs={"torch_dtype": torch.bfloat16},
	token = os.getenv('HF_API_KEY'),
	device_map="auto",
	)
	Global_pipeline.model.load_adapter(peft_model_id = "technicolor/llama3.1_8b_outline_generation", adapter_name="outline")
	Global_pipeline.model.load_adapter(peft_model_id ="technicolor/llama3.1_8b_conclusion_generation", adapter_name="conclusion")
	Global_pipeline.model.load_adapter(peft_model_id ="technicolor/llama3.1_8b_abstract_generation", adapter_name="abstract")


	title = "A Survey of Large Language Models"
	intro = '''L
	ANGUAGE is a prominent ability in human beings to
	express and communicate, which develops in early
	childhood and evolves over a lifetime [3, 4]. Machines,
	however, cannot naturally grasp the abilities of understanding and communicating in the form of human language,
	unless equipped with powerful artificial intelligence (AI)
	algorithms. It has been a longstanding research challenge
	to achieve this goal, to enable machines to read, write, and
	communicate like humans [5].
	Technically, language modeling (LM) is one of the major
	approaches to advancing language intelligence of machines.
	In general, LM aims to model the generative likelihood
	of word sequences, so as to predict the probabilities of
	future (or missing) tokens. The research of LM has received
	extensive attention in the literature, which can be divided
	into four major development stages:
	• Statistical language models (SLM). SLMs [6–9] are developed based on statistical learning methods that rose in
	the 1990s. The basic idea is to build the word prediction
	model based on the Markov assumption, e.g., predicting the
	next word based on the most recent context. The SLMs with
	a fixed context length n are also called n-gram language
	models, e.g., bigram and trigram language models. SLMs
	have been widely applied to enhance task performance
	in information retrieval (IR) [10, 11] and natural language
	processing (NLP) [12–14]. However, they often suffer from
	the curse of dimensionality: it is difficult to accurately
	estimate high-order language models since an exponential
	number of transition probabilities need to be estimated.
	Thus, specially designed smoothing strategies such as backoff estimation [15] and Good–Turing estimation [16] have
	been introduced to alleviate the data sparsity problem.
	• Neural language models (NLM). NLMs [1, 17, 18] characterize the probability of word sequences by neural networks,
	e.g., multi-layer perceptron (MLP) and recurrent neural networks (RNNs). As a remarkable contribution, the work in
	[1] introduced the concept of distributed representation of
	words and built the word prediction function conditioned
	on the aggregated context features (i.e., the distributed
	word vectors). By extending the idea of learning effective
	features for text data, a general neural network approach
	was developed to build a unified, end-to-end solution for
	various NLP tasks [2]. Furthermore, word2vec [19, 20] was
	proposed to build a simplified shallow neural network
	for learning distributed word representations, which were
	demonstrated to be very effective across a variety of NLP
	tasks. These studies have initiated the use of language
	models for representation learning (beyond word sequence
	modeling), having an important impact on the field of NLP.
	• Pre-trained language models (PLM). As an early attempt, ELMo [21] was proposed to capture context-aware
	word representations by first pre-training a bidirectional
	LSTM (biLSTM) network (instead of learning fixed word
	representations) and then fine-tuning the biLSTM network
	according to specific downstream tasks. Furthermore, based
	on the highly parallelizable Transformer architecture [22]
	with self-attention mechanisms, BERT [23] was proposed by
	pre-training bidirectional language models with specially
	designed pre-training tasks on large-scale unlabeled corpora. These pre-trained context-aware word representations
	are very effective as general-purpose semantic features,
	which have largely raised the performance bar of NLP
	tasks. This study has inspired a large number of follow-up
	work, which sets the “pre-training and fine-tuning” learning
	paradigm. Following this paradigm, a great number of studies on PLMs have been developed, introducing either different architectures [24, 25] (e.g., GPT-2 [26] and BART [24]) or
	improved pre-training strategies [27–29]. In this paradigm, it
	often requires fine-tuning the PLM for adapting to different
	downstream tasks.
	• Large language models (LLM). Researchers find that
	scaling PLM (e.g., scaling model size or data size) often
	leads to an improved model capacity on downstream tasks
	(i.e., following the scaling law [30]). A number of studies
	have explored the performance limit by training an ever
	larger PLM (e.g., the 175B-parameter GPT-3 and the 540Bparameter PaLM). Although scaling is mainly conducted
	in model size (with similar architectures and pre-training
	tasks), these large-sized PLMs display different behaviors
	from smaller PLMs (e.g., 330M-parameter BERT and 1.5Bparameter GPT-2) and show surprising abilities (called emergent abilities [31]) in solving a series of complex tasks. For
	example, GPT-3 can solve few-shot tasks through in-context
	learning, whereas GPT-2 cannot do well. Thus, the research
	community coins the term “large language models (LLM)”
	1
	for these large-sized PLMs [32–35], which attract increasing
	research attention (See Figure 1). A remarkable application
	of LLMs is ChatGPT2
	that adapts the LLMs from the GPT
	series for dialogue, which presents an amazing conversation
	ability with humans. We can observe a sharp increase of the
	arXiv papers that are related to LLMs after the release of
	ChatGPT in Figure 1.
	As discussed before, language model is not a new technical concept specially for LLMs, but has evolved with the
	advance of artificial intelligence over the decades. Early language models mainly aim to model and generate text data,
	while latest language models (e.g., GPT-4) focus on complex
	task solving. From language modeling to task solving, it is an
	important leap in scientific thinking, which is the key to
	understand the development of language models in the research history. From the perspective of task solving, the four
	generations of language models have exhibited different levels of model capacities. In Figure 2, we describe the evolution process of language models in terms of the task solving
	capacity. At first, statistical language models mainly assisted
	in some specific tasks (e.g., retrieval or speech tasks), in
	which the predicted or estimated probabilities can enhance
	the performance of task-specific approaches. Subsequently,
	neural language models focused on learning task-agnostic
	representations (e.g., features), aiming to reduce the efforts
	for human feature engineering. Furthermore, pre-trained
	language models learned context-aware representations that
	can be optimized according to downstream tasks. For the
	latest generation of language model, LLMs are enhanced by
	exploring the scaling effect on model capacity, which can be
	considered as general-purpose task solvers. To summarize,
	in the evolution process, the task scope that can be solved
	by language models have been greatly extended, and the
	task performance attained by language models have been
	significantly enhanced.
	In the existing literature, PLMs have been widely discussed and surveyed [36–39], while LLMs are seldom reviewed in a systematic way. To motivate our survey, we first
	highlight three major differences between LLMs and PLMs.
	First, LLMs display some surprising emergent abilities that
	may not be observed in previous smaller PLMs. These abilities are key to the performance of language models on complex tasks, making AI algorithms unprecedently powerful
	and effective. Second, LLMs would revolutionize the way
	that humans develop and use AI algorithms. Unlike small
	PLMs, the major approach to accessing LLMs is through
	the prompting interface (e.g., GPT-4 API). Humans have to
	understand how LLMs work and format their tasks in a way
	that LLMs can follow. Third, the development of LLMs no
	longer draws a clear distinction between research and engineering. The training of LLMs requires extensive practical
	experiences in large-scale data processing and distributed
	parallel training. To develop capable LLMs, researchers
	have to solve complicated engineering issues, working with
	engineers or being engineers.
	Nowadays, LLMs are posing a significant impact on
	the AI community, and the advent of ChatGPT and GPT-4
	leads to the rethinking of the possibilities of artificial general
	intelligence (AGI). OpenAI has published a technical article
	entitled “Planning for AGI and beyond”, which discusses
	the short-term and long-term plans to approach AGI [40],
	and a more recent paper has argued that GPT-4 might be
	considered as an early version of an AGI system [41]. The
	research areas of AI are being revolutionized by the rapid
	progress of LLMs. In the field of NLP, LLMs can serve as a
	general-purpose language task solver (to some extent), and
	the research paradigm has been shifting towards the use
	of LLMs. In the field of IR, traditional search engines are
	challenged by the new information seeking way through AI
	chatbots (i.e., ChatGPT), and New Bing3 presents an initial
	attempt that enhances the search results based on LLMs. In
	the field of CV, the researchers try to develop ChatGPT-like
	vision-language models that can better serve multimodal
	dialogues [42–45], and GPT-4 [46] has supported multimodal input by integrating the visual information. This new
	wave of technology would potentially lead to a prosperous
	ecosystem of real-world applications based on LLMs. For
	instance, Microsoft 365 is being empowered by LLMs (i.e.,
	Copilot) to automate the office work, and OpenAI supports
	the use of plugins in ChatGPT for implementing special
	functions.
	Despite the progress and impact, the underlying principles of LLMs are still not well explored. Firstly, it is
	mysterious why emergent abilities occur in LLMs, instead of
	smaller PLMs. As a more general issue, there lacks a deep,
	detailed investigation of the key factors that contribute to
	the superior abilities of LLMs. It is important to study when
	and how LLMs obtain such abilities [47]. Although there are
	some meaningful discussions about this problem [31, 47],
	more principled investigations are needed to uncover the
	“secrets“ of LLMs. Secondly, it is difficult for the research
	community to train capable LLMs. Due to the huge demand of computation resources, it is very costly to carry
	out repetitive, ablating studies for investigating the effect
	of various strategies for training LLMs. Indeed, LLMs are
	mainly trained by industry, where many important training
	details (e.g., data collection and cleaning) are not revealed
	to the public. Thirdly, it is challenging to align LLMs with
	human values or preferences. Despite the capacities, LLMs
	are also likely to produce toxic, fictitious, or harmful contents. It requires effective and efficient control approaches
	to eliminating the potential risk of the use of LLMs [46].
	Faced with both opportunities and challenges, it needs
	more attention on the research and development of LLMs. In
	order to provide a basic understanding of LLMs, this survey
	conducts a literature review of the recent advances in LLMs
	from four major aspects, including pre-training (how to pretrain a capable LLM), adaptation (how to effectively adapt
	pre-trained LLMs for better use), utilization (how to use
	LLMs for solving various downstream tasks) and capability
	evaluation (how to evaluate the abilities of LLMs and existing
	empirical findings). We thoroughly comb the literature and
	summarize the key findings, techniques, and methods of
	LLMs. For this survey, we also create a GitHub project
	website by collecting the supporting resources for LLMs, at
	the link https://github.com/RUCAIBox/LLMSurvey. We
	are also aware of several related review articles on PLMs
	or LLMs [32, 36, 38, 39, 43, 48–54]. These papers either
	discuss PLMs or some specific (or general) aspects of LLMs.
	Compared with them, we focus on the techniques and
	methods to develop and use LLMs and provide a relatively
	comprehensive reference to important aspects of LLMs.
	The remainder of this survey is organized as follows:
	Section 2 introduces the background for LLMs and the evolution of GPT-series models, followed by the summarization
	of available resources for developing LLMs in Section 3.
	Sections 4, 5, 6, and 7 review and summarize the recent
	progress from the four aspects of pre-training, adaptation,
	utilization, and capacity evaluation, respectively. Then, Section 8 discusses the practical guide for prompt design,
	and Section 9 reviews the applications of LLMs in several
	representative domains. Finally, we conclude the survey in
	Section 10 by summarizing the major findings and discuss
	the remaining issues for future work.
	'''


	conclusion_generator = ConclusionGenerator(Global_pipeline)
	with_lora = conclusion_generator.generate(title, intro, mode='lora')
	# print("The conclusion generated with LORA is: \n", with_lora)
	# print("=============================================================")
	with_test = conclusion_generator.generate(title, intro, mode='test')
	# print("The conclusion generated with test is: \n", with_test)
	# print("=============================================================")