CodeFuse

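CodeFuse is a code-specific large model developed in-house by Ant Group. Based on the developer's input, it helps automatically generate code, add comments, generate tests, and repair and optimize code, improving R&D efficiency. Whether the user is a beginner or an experienced developer, CodeFuse can improve programming efficiency and accuracy.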


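CodeFuse introduction

Ant Group open-sourced its code large model, CodeFuse, at the recently concluded 2023 Bund Conference. It is currently available for download and trial on ModelScope.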

https://modelscope.cn/studios/codefuse-ai/CodeFuse-CodeLlama34B-MFT-Demo/summary

https://modelscope.cn/models/codefuse-ai/CodeFuse-13B/summary

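CodeFuse is a code large model developed in-house by Ant Group. It provides intelligent suggestions and real-time support, automatically generating code, comments, and more to improve R&D efficiency. In evaluation, CodeFuse scored higher than GPT-4 and WizardCoder-34B. The open-source release includes the code framework and the models. The framework supports multi-task fine-tuning, covering tasks such as code generation and code translation.

The models include CodeFuse13B-4K and CodeFuse-CodeLlaMa34B-MFT. CodeFuse entered internal testing in June and is used in scenarios such as development assistants and IDE plugins.

Try the model

CodeFuse-CodeLlaMa34B-MFT is live in a ModelScope Studio, where developers can try the model's code generation directly.

Studio link: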

https://modelscope.cn/studios/codefuse-ai/CodeFuse-CodeLlama34B-MFT-Demo/summary

Model links and download

The CodeFuse model series is open-sourced on ModelScope:

CodeFuse-13B model:

https://modelscope.cn/models/codefuse-ai/CodeFuse-13B/summary

from modelscope.hub.snapshot_download import snapshot_download
model_dir = snapshot_download('codefuse-ai/CodeFuse-13B', revision='v1.0.0')

CodeFuse-CodeLlama-34B model:

https://modelscope.cn/models/codefuse-ai/CodeFuse-CodeLlama-34B/summary

from modelscope.hub.snapshot_download import snapshot_download
model_dir = snapshot_download('codefuse-ai/CodeFuse-CodeLlama-34B', revision='v1.0.0')

Model inference

CodeFuse-13B:

import torch
from modelscope import AutoModelForCausalLM, AutoTokenizer, snapshot_download

# Download the model snapshot, then load the tokenizer and the model in fp16.
model_dir = snapshot_download('codefuse-ai/CodeFuse-13B', revision='v1.0.0')
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto", torch_dtype=torch.float16).eval()

# Encode a code-completion prompt and generate up to 200 new tokens.
input_ids = tokenizer.encode("# language: Python\ndef quick_sort(array):\n", return_tensors="pt").to("cuda")
output_ids = model.generate(input_ids, max_new_tokens=200)
print(tokenizer.decode(output_ids[0]))

"""Out[0]
# language: Python
def quick_sort(array):
    if len(array) <= 1:
        return array
    else:
        pivot = array[0]
        less_than_pivot = [i for i in array[1:] if i <= pivot]
        greater_than_pivot = [i for i in array[1:] if i > pivot]
        return quick_sort(less_than_pivot) + [pivot] + quick_sort(greater_than_pivot)

# Test the function
print(quick_sort([3,6,8,10,1,2,1]))<|endoftext|>
"""
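The decoded output above ends with a literal `<|endoftext|>` marker. The following is a minimal post-processing sketch, assuming the marker text is exactly as printed in the sample output; the helper name `strip_end_marker` is hypothetical and is not part of ModelScope or CodeFuse.

```python
END_MARKER = "<|endoftext|>"  # assumption: marker text as printed in the sample output


def strip_end_marker(decoded: str, marker: str = END_MARKER) -> str:
    """Return the text before the first end-of-text marker, without trailing whitespace."""
    return decoded.split(marker, 1)[0].rstrip()


# Last line of the sample output shown above.
sample = "print(quick_sort([3,6,8,10,1,2,1]))<|endoftext|>\n"
cleaned = strip_end_marker(sample)
print(cleaned)
```

When the marker is registered as a special token in the tokenizer, passing `skip_special_tokens=True` to `tokenizer.decode` is the more usual way to drop it, as the CodeFuse-CodeLlama-34B example does with `batch_decode`.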

CodeFuse-CodeLlama-34B:

import torch
from modelscope import AutoTokenizer, AutoModelForCausalLM, snapshot_download

model_dir = snapshot_download('codefuse-ai/CodeFuse-CodeLlama-34B', revision='v1.0.0')

# Load the tokenizer and set left padding plus explicit pad/eos token ids.
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True, use_fast=False, legacy=False)
tokenizer.padding_side = "left"
tokenizer.pad_token_id = tokenizer.convert_tokens_to_ids("<unk>")
tokenizer.eos_token_id = tokenizer.convert_tokens_to_ids("</s>")

# Load the model in bfloat16 and let device_map place it on the available devices.
model = AutoModelForCausalLM.from_pretrained(model_dir, trust_remote_code=True,
                                             device_map='auto',
                                             torch_dtype=torch.bfloat16)

# Wrap the user message in the human/bot role tags.
HUMAN_ROLE_START_TAG = "<|role_start|>human<|role_end|>"
BOT_ROLE_START_TAG = "<|role_start|>bot<|role_end|>"
text = f"{HUMAN_ROLE_START_TAG}write a python function of quick sort.{BOT_ROLE_START_TAG}"

inputs = tokenizer(text, return_tensors='pt', padding=True, add_special_tokens=False).to("cuda")
outputs = model.generate(
    inputs=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    max_new_tokens=512,
    top_p=0.95,
    temperature=0.1,
    do_sample=True,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id
)

# Decode only the newly generated tokens, skipping the prompt.
gen_text = tokenizer.batch_decode(outputs[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(gen_text[0])

"""Out[0]
Here is a Python function for quick sort:

```python
def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    else:
        pivot = arr[0]
        less = [i for i in arr[1:] if i <= pivot]
        greater = [i for i in arr[1:] if i > pivot]
        return quick_sort(less) + [pivot] + quick_sort(greater)
```

This function works by selecting the first element of the array as the pivot, and then partitioning the rest of the array into two parts: one with elements less than the pivot, and one with elements greater than the pivot. It then recursively sorts the two parts, and concatenates them with the pivot in the middle. It continues this process until the array is sorted.

Please note that this is a simple implementation of quick sort and may not be the most efficient for large lists. For large lists, a more complex version of quick sort that uses a partition function and swaps elements in place would be more efficient.
"""
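The example above wraps the request in role tags and gets back prose with a fenced code block. The sketch below shows one way to build that single-turn prompt and pull the code out of such a reply. The helper names `build_prompt` and `extract_code_blocks` are hypothetical, and the regular expression assumes replies use triple-backtick fences as in the sample output; the sample `reply` is a shortened stand-in, not real model output.

```python
import re
from typing import List

HUMAN_ROLE_START_TAG = "<|role_start|>human<|role_end|>"
BOT_ROLE_START_TAG = "<|role_start|>bot<|role_end|>"

FENCE = "`" * 3  # triple backtick, built programmatically to keep this listing readable
# Assumption: code in replies is wrapped in fences with an optional language tag.
FENCE_RE = re.compile(FENCE + r"[\w+-]*\n(.*?)" + FENCE, re.DOTALL)


def build_prompt(user_message: str) -> str:
    """Wrap one user message in the single-turn human/bot format shown above."""
    return f"{HUMAN_ROLE_START_TAG}{user_message}{BOT_ROLE_START_TAG}"


def extract_code_blocks(reply: str) -> List[str]:
    """Return the bodies of all fenced code blocks found in a model reply."""
    return [body.strip("\n") for body in FENCE_RE.findall(reply)]


# Shortened stand-in for a model reply (not real output).
reply = (
    "Here is a Python function for quick sort:\n"
    f"{FENCE}python\n"
    "def quick_sort(arr):\n"
    "    return sorted(arr)  # placeholder body for the demo\n"
    f"{FENCE}\n"
    "This function works by ..."
)

prompt = build_prompt("write a python function of quick sort.")
blocks = extract_code_blocks(reply)
print(prompt)
print(blocks[0])
```

Keeping prompt construction in one function makes it harder to forget a role tag, which matters because the tokenizer call above is made with `add_special_tokens=False`.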

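Open-source datasets

The CodeFuse project has also open-sourced two datasets: CodeExercise-Python-27k and Evol-instruction-66k.

CodeExercise-Python-27k consists of 27,000 Python programming exercises (in English), covering hundreds of Python topics such as basic syntax, data structures, algorithm applications, database queries, and machine learning.

CodeExercise-Python-27k dataset link: https://modelscope.cn/datasets/codefuse-ai/CodeExercise-Python-27k/summary

Evol-instruction-66k follows the method described in the paper "WizardCoder: Empowering Code Large Language Models with Evol-Instruct", which improves the fine-tuning of pre-trained code models by adding complex code instructions. It is built on the open-source dataset Evol-Instruct-Code-80k-v1 through a series of processing steps, including low-quality filtering and filtering of data similar to HumanEval. The original 80k samples were filtered down to 66k fine-tuning samples.

Evol-instruction-66k dataset link: https://modelscope.cn/datasets/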
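The two filtering steps named for Evol-instruction-66k (dropping low-quality samples and samples that resemble HumanEval problems) can be illustrated with a small sketch. The source does not state the actual criteria, so everything here is an assumption made for illustration: the `instruction` and `output` field names, the minimum-length rule, the use of `difflib.SequenceMatcher`, and the 0.8 threshold. The benchmark prompt in the demo is a made-up stand-in, not a real HumanEval item.

```python
from difflib import SequenceMatcher
from typing import Dict, Iterable, List


def is_low_quality(sample: Dict[str, str], min_output_chars: int = 20) -> bool:
    """Hypothetical rule: flag samples with an empty instruction or a very short output."""
    return not sample["instruction"].strip() or len(sample["output"].strip()) < min_output_chars


def too_similar(text: str, benchmark_prompts: Iterable[str], threshold: float = 0.8) -> bool:
    """True if text closely matches any benchmark prompt (character-level similarity ratio)."""
    return any(SequenceMatcher(None, text, p).ratio() >= threshold for p in benchmark_prompts)


def filter_samples(samples: List[Dict[str, str]],
                   benchmark_prompts: Iterable[str],
                   threshold: float = 0.8) -> List[Dict[str, str]]:
    """Keep samples that pass the quality rule and do not resemble a benchmark prompt."""
    prompts = list(benchmark_prompts)
    return [
        s for s in samples
        if not is_low_quality(s) and not too_similar(s["instruction"], prompts, threshold)
    ]


# Made-up stand-in for a benchmark prompt.
benchmark = ["Return True if any two numbers in the list are closer than the threshold."]

samples = [
    {"instruction": "Write a function that reverses a string.",
     "output": "def reverse(s):\n    return s[::-1]"},
    # Near-duplicate of the benchmark prompt: should be dropped.
    {"instruction": "Return True if any two numbers in the list are closer than the threshold!",
     "output": "def has_close_elements(numbers, threshold):\n    ..."},
    # Output too short under the hypothetical quality rule: should be dropped.
    {"instruction": "Sort a list.", "output": "pass"},
]

kept = filter_samples(samples, benchmark)
print([s["instruction"] for s in kept])
```

Character-level matching is quadratic in text length and only catches near-verbatim overlap; a real pipeline would more likely use n-gram or embedding similarity, which this sketch does not attempt.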
