Document Oracle

This is a LoRA model for OpenLM-Research's Open LLaMA 7B model trained on our specifically-assembled webGPTxDolly extractive Q&A dataset using the Alpaca instruction format.

The included tokenizer is based on that of the baseline model, however the BOS, EOS, and UNK/PAD tokens are distinctly defined, which was not the case with the baseline.

The architecture of this LoRA model follows that of the LLaMA-7b Alpaca-LoRA with the hyper-parameters:

LORA_R = 8
LORA_ALPHA = 16
LORA_DROPOUT= 0.05
LORA_TARGET_MODULES = [
    "q_proj",
    "k_proj",
    "v_proj",
    "o_proj",
]

The model was trained using PEFT for up to 3 epochs, with load_best_model_at_end=True set. The learning rate was set to 5e-5, so the minimal validation loss occurred very near to the end of training.

Both the combined model and adapter weights are available.

The combined model can be loaded and used right out of the box:

BASE_MODEL = "starfishmedical/SFDocumentOracle-open_llama_7b_700bt_lora"

model = LlamaForCausalLM.from_pretrained(
    BASE_MODEL,
    torch_dtype=torch.float16,
    device_map="sequential"
)
tokenizer = LlamaTokenizer.from_pretrained(BASE_MODEL)

The adapter can be recombined with the baseline model to generate text:

BASE_MODEL = "openlm-research/open_llama_7b_700bt_preview"

bmodel = LlamaForCausalLM.from_pretrained(
    BASE_MODEL,
    torch_dtype=torch.float16,
    device_map="sequential"
)

peft_model_id = "starfishmedical/SFDocumentOracle-open_llama_7b_700bt_lora"
tokenizer = LlamaTokenizer.from_pretrained(peft_model_id)

model = PeftModel.from_pretrained(bmodel, peft_model_id)
model = model.to("cuda:0")
model.eval()

text_tmp = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{context}

### Response:
"""

Qs = [
    "What is Beyonce's full name?", #should be ez but who knows
    "What girl group was Beyonce a member of?", #Can look up, uses specific terminology
    "What is the name of Beyonce's first album?", #Can look up, requires "debut"->"first"
    "When did Beyonce's group disband?" #tougher --requires referential knowledge
]
#First para of Beyonce's wikipedia article
conte = "Beyoncé Giselle Knowles-Carter (née Knowles; born September 4, 1981)[5] is an American singer, songwriter and dancer. Regarded as one of the most successful performers of her generation, she is known for her boundary-pushing artistry and vocal abilities.[6][7] Her success earned her the nickname \"Queen Bey\" and led her to become a pop icon of the 21st century.[8][9]  Beyoncé started performing in various singing and dancing competitions as a child. She rose to fame in the late 1990s as a member of the R&B girl group Destiny's Child, one of the best-selling girl groups of all time. Their hiatus saw the release of her debut album Dangerously in Love (2003), which featured the US Billboard Hot 100 number-one singles \"Crazy in Love\" and \"Baby Boy.\" Following the 2006 disbanding of Destiny's Child, Beyoncé released a series of successful solo albums, including B'Day (2006), I Am... Sasha Fierce (2008), 4 (2011), Beyoncé (2013), Lemonade (2016), and Renaissance (2022), and became the first solo artist to have their first seven studio albums debut at number one on the Billboard 200.[10][11]  After professionally splitting from her manager and father, Mathew Knowles, in 2010, Beyoncé's artistry achieved wider critical acclaim for releasing sonically experimental visual albums and exploring societal themes such as infidelity, feminism, womanism, escapism, and hedonism. In 2018, she released Everything Is Love, a collaborative album with her husband, Jay-Z, as the Carters. In 2020, she released the musical film Black Is King, inspired by the music of the film soundtrack The Lion King: The Gift (2019), with praise from critics. Beyoncé also starred in multiple films such as Austin Powers in Goldmember (2002), The Pink Panther (2006), Dreamgirls (2006), Cadillac Records (2008), Obsessed (2009), and The Lion King (2019).  Having sold 200 million records worldwide, Beyoncé is one of the world's best-selling recording artists.[12][13] Throughout her career, she has amassed multiple chart-topping singles, including \"Check on It,\" \"Irreplaceable,\" \"Single Ladies,\" and \"Break My Soul.\" As a featured artist, Beyoncé topped the Billboard Hot 100 with the remixes of \"Perfect\" by Ed Sheeran and \"Savage\" by Megan Thee Stallion. With 31 career top-ten singles on the Billboard Hot 100, she became the first female artist and third overall to achieve at least twenty as a solo artist and ten as a group member.[14]  Beyoncé's accolades include 32 Grammy Awards, 26 MTV Video Music Awards (including the Michael Jackson Video Vanguard Award in 2014), 24 NAACP Image Awards, 31 BET Awards, and 17 Soul Train Music Awards, all of which are more than any other artist. Her success during the 2000s earned her recognition as the Recording Industry Association of America (RIAA)'s Top Certified Artist of the Decade and Billboard's Top Female Artist of the Decade.[15] In 2014, Billboard named her the highest-earning black musician of all time. In 2020, Time magazine featured her among a list of 100 women who defined the last century."
for instr in Qs:
    #Construct the question-format and send it through the model
    inputt = text_tmp.format(instruction=instr, context=conte)
    input_ids = tokenizer(inputt, return_tensors="pt").input_ids.to("cuda:0")
    input_length = input_ids.shape[1]
    with torch.no_grad():
        outputs = model.generate(input_ids=input_ids, max_new_tokens=256)
        output = outputs[:, input_length:]
        print(f"Q: {instr}")
        print(tokenizer.decode(output[0], skip_special_tokens=True).strip())
        print("\n")

to produce the output:

Q: What is Beyonce's full name?
Beyoncé Giselle Knowles-Carter


Q: What girl group was Beyonce a member of?
Beyonce was a member of the R&B girl group Destiny's Child.


Q: What is the name of Beyonce's first album?
Dangerously in Love


Q: When did Beyonce's group disband?
The group disbanded in 2006
Downloads last month
14
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Dataset used to train starfishmedical/SFDocumentOracle-open_llama_7b_700bt_lora