structuredxtract


Extract structured information from unstructured text with pattern-matching precision.

A Python package that simplifies structured data extraction from plain text inputs using a language model with pattern-matching capabilities. Ideal for surveys, feedback analysis, and report generation where consistent, well-formatted outputs are required.


🚀 Features

  • Pattern-based extraction: Uses regex patterns to enforce structured output formats.
  • Flexible LLM integration: Works with default ChatLLM7 or any LangChain-compatible model.
  • No multimedia support: Focuses solely on text-based inputs for reliability.
  • Consistent formatting: Ensures responses match expected schemas (tables, summaries, key-value pairs).
  • Easy customization: Replace default LLM with OpenAI, Anthropic, Google, or any other LangChain model.
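The pattern-based extraction idea in the first bullet can be illustrated with Python's standard `re` module. This is a minimal sketch and not the package's internals: the pattern `FIELD_PATTERN` and the helper `extract_fields` are hypothetical names chosen for illustration.

```python
import re
from typing import Dict

# Hypothetical pattern: one "Key: Value" pair per line.
FIELD_PATTERN = re.compile(
    r"^\s*(?P<key>[A-Za-z][\w ]*?)\s*:\s*(?P<value>.+?)\s*$",
    re.MULTILINE,
)


def extract_fields(text: str) -> Dict[str, str]:
    """Collect every 'Key: Value' line in text into a dict."""
    return {m.group("key"): m.group("value") for m in FIELD_PATTERN.finditer(text)}


sample = """
Name: John Doe
Age: 30
Occupation: Software Engineer
"""

fields = extract_fields(sample)
print(fields)
```

A regex like this only accepts text that already has the expected shape, which is the sense in which a pattern can enforce an output format.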

📦 Installation

pip install structuredxtract

🔧 Usage

Basic Usage (Default LLM7)

from structuredxtract import structuredxtract

user_input = """
Name: John Doe
Age: 30
Occupation: Software Engineer
"""

response = structuredxtract(user_input)
print(response)  # Structured output based on predefined patterns

Custom LLM Integration

Replace the default ChatLLM7 with your preferred model:

OpenAI

from langchain_openai import ChatOpenAI
from structuredxtract import structuredxtract

llm = ChatOpenAI()
response = structuredxtract(user_input, llm=llm)

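Anthropic

from langchain_anthropic import ChatAnthropic
from structuredxtract import structuredxtract

# ChatAnthropic requires a model name; the one below is only an example.
llm = ChatAnthropic(model="claude-3-5-sonnet-20241022")
response = structuredxtract(user_input, llm=llm)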

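Google Generative AI

from langchain_google_genai import ChatGoogleGenerativeAI
from structuredxtract import structuredxtract

# ChatGoogleGenerativeAI requires a model name; the one below is only an example.
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")
response = structuredxtract(user_input, llm=llm)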

🔑 API Key

  • Default: Uses LLM7_API_KEY from environment variables.
  • Manual override: Pass via api_key parameter or set LLM7_API_KEY before importing.
    import os
    os.environ["LLM7_API_KEY"] = "your_api_key_here"

Get a free API key at LLM7 Token.
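The two ways of supplying the key can be sketched as a small lookup. The helper `resolve_api_key` is hypothetical and assumes that an explicitly passed key takes priority over the environment variable; the package may resolve the key differently.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None) -> Optional[str]:
    """Return the explicit key if given, otherwise the LLM7_API_KEY environment variable."""
    if api_key:
        return api_key
    return os.environ.get("LLM7_API_KEY")


os.environ["LLM7_API_KEY"] = "key_from_env"

env_key = resolve_api_key()                        # falls back to the environment
explicit_key = resolve_api_key("key_from_argument")  # explicit argument wins
print(env_key, explicit_key)
```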


📜 Parameters

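| Parameter  | Type                    | Description                                                                   |
|------------|-------------------------|-------------------------------------------------------------------------------|
| user_input | str                     | Plain text input to extract structured data from.                             |
| api_key    | Optional[str]           | LLM7 API key (optional if using environment variable).                        |
| llm        | Optional[BaseChatModel] | Custom LangChain LLM (e.g., ChatOpenAI, ChatAnthropic). Defaults to ChatLLM7. |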

📊 Output

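Returns a List[str] of extracted items matching the predefined patterns. The example below shows the kind of key-value content that may be extracted; because the declared return type is List[str], each item may arrive as a string rather than as a parsed dict: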

[
    {"Name": "John Doe", "Age": "30", "Occupation": "Software Engineer"},
    {"Key1": "Value1", "Key2": "Value2"}
]
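Since the declared type is List[str] while the example shows dicts, a caller may want to normalize the result before using it. The sketch below is one possible approach, assuming string items may contain JSON; `normalize_items` is a hypothetical helper, not part of the package.

```python
import json
from typing import Any, List


def normalize_items(items: List[Any]) -> List[Any]:
    """Parse strings that are valid JSON; leave every other item unchanged."""
    normalized = []
    for item in items:
        if isinstance(item, str):
            try:
                normalized.append(json.loads(item))
                continue
            except json.JSONDecodeError:
                pass  # Not JSON: keep the raw string.
        normalized.append(item)
    return normalized


raw = ['{"Name": "John Doe", "Age": "30"}', "free-form text", {"Key1": "Value1"}]
parsed = normalize_items(raw)
print(parsed)
```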

🔄 Rate Limits

  • LLM7 Free Tier: Sufficient for most use cases.
  • Custom API Key: For higher limits, pass via api_key or environment variable.
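If a rate limit is hit, one common client-side mitigation is to retry with exponential backoff. The sketch below is generic and makes no claim about how the package or LLM7 signals a rate limit: `call_with_retry` is a hypothetical helper, and `flaky` merely simulates a call that fails twice.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(fn: Callable[[], T], attempts: int = 3, base_delay: float = 1.0) -> T:
    """Call fn, retrying on any exception with exponentially growing delays."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error.
            time.sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")


calls = {"count": 0}


def flaky() -> str:
    """Simulated call that fails on the first two attempts."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("simulated rate limit")
    return "ok"


result = call_with_retry(flaky, base_delay=0.0)
print(result, calls["count"])
```

In practice the wrapped callable would be something like `lambda: structuredxtract(user_input)`, with the `except` clause narrowed to whatever error the chosen LLM client actually raises.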

📝 License

MIT


📢 Support & Issues

For bugs or feature requests, open an issue on GitHub.


👤 Author

Eugene Evstafev 📧 hi@euegne.plus 🔗 GitHub: chigwell
