Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

Latest commit

 

History

History
History
31 lines (18 loc) · 755 Bytes

File metadata and controls

31 lines (18 loc) · 755 Bytes
Copy raw file
Download raw file
Open symbols panel
Edit and raw actions
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
"""llama-cpp-python server from scratch in a single file.
"""
# import llama_cpp
# path = b"../../models/Qwen1.5-0.5B-Chat-GGUF/qwen1_5-0_5b-chat-q8_0.gguf"
# model_params = llama_cpp.llama_model_default_params()
# model = llama_cpp.llama_load_model_from_file(path, model_params)
# if model is None:
# raise RuntimeError(f"Failed to load model from file: {path}")
# ctx_params = llama_cpp.llama_context_default_params()
# ctx = llama_cpp.llama_new_context_with_model(model, ctx_params)
# if ctx is None:
# raise RuntimeError("Failed to create context")
from fastapi import FastAPI
app = FastAPI()
import openai.types.chat as types
@app.post("/v1/chat/completions")
def create_chat_completions():
return {"message": "Hello World"}
Morty Proxy This is a proxified and sanitized view of the page, visit original site.