How to create an Azure AI Search vector search


This is my code so far:

# Class names as used in this question; they differ between
# azure-search-documents versions.
from azure.search.documents.indexes.models import (
    AzureOpenAIParameters,
    AzureOpenAIVectorizer,
    HnswAlgorithmConfiguration,
    VectorSearch,
    VectorSearchProfile,
)

vector_search = VectorSearch(
    algorithms=[
        HnswAlgorithmConfiguration(
            name="myHnsw",
            kind="hnsw",
            parameters={
                "m": 4,
                "efConstruction": 400,
                "efSearch": 500,
                "metric": "cosine",
            },
        )
    ],
    profiles=[
        VectorSearchProfile(
            name="myHnswProfile",
            algorithm_configuration_name="myHnsw",
            vectorizer="myVectorizer",
        )
    ],
    vectorizers=[
        AzureOpenAIVectorizer(
            name="myVectorizer",
            azure_open_ai_parameters=AzureOpenAIParameters(
                resource_uri=azure_openai_endpoint,
                deployment_id=azure_openai_embedding_deployment,
                # model_name=embedding_model_name,
                api_key=azure_openai_key,
            ),
        )
    ],
)

Note that model_name is commented out in the vectorizer; otherwise I get an error saying the model attribute does not exist.

Then I create the search index like this:

# Define the index fields
client = SearchIndexClient(endpoint, credential)
fields = [
    SimpleField(name="id", type=SearchFieldDataType.String, key=True,
                sortable=True, filterable=True, facetable=True),
    SimpleField(name="originalbalancedue", type=SearchFieldDataType.Double,
                sortable=True, filterable=True, facetable=True),
    SimpleField(name="adjustedbalancedue", type=SearchFieldDataType.Double,
                sortable=True, filterable=True, facetable=True),
    SimpleField(name="feeamount", type=SearchFieldDataType.Double,
                sortable=True, filterable=True, facetable=True),
    SearchableField(name="result", type=SearchFieldDataType.String,
                    sortable=True, filterable=True,
                    vector_search_dimensions=1536,
                    vector_search_profile_name="myHnswProfile"),
]

index = SearchIndex(name=index_name, fields=fields, vector_search=vector_search)
result = client.create_or_update_index(index)
print(f'{result.name} created')

The search index is created successfully.

Next, I try to insert the text and embeddings into the vector store:

with open('worthiness_with_result_small.json', 'r') as f:
    content = f.read().strip()
    if content:
        documents = json.loads(content)
        print(f"loaded {len(documents)} documents")
    else:
        print("The file is empty.")
search_client = SearchClient(endpoint=endpoint, index_name=index_name,
                             credential=credential)
result = search_client.upload_documents(documents)
print(f"Uploaded {len(documents)} documents")

The data in my worthiness_with_result_small.json file looks like this:

[{"id":"425001","originalbalancedue":1684269.59,"adjustedbalancedue":1683369.59,"feeamount":6659.1199999999998900,"result":"5759.1199999999998900"}]

The documents upload successfully.
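Before uploading, it can help to confirm that the file actually produced documents and that each document's keys match the index field names. The following is only a minimal sketch: `load_documents` and `EXPECTED_FIELDS` are hypothetical names (not part of the Azure SDK), and the field names are assumed to be the ones from the index definition above.

```python
import json

# Field names taken from the index definition in the question.
EXPECTED_FIELDS = {"id", "originalbalancedue", "adjustedbalancedue", "feeamount", "result"}


def load_documents(raw_text):
    """Parse a JSON array of documents and check its keys against the index fields."""
    content = raw_text.strip()
    if not content:
        # Fail early instead of leaving `documents` undefined for the upload call.
        raise ValueError("The file is empty.")
    documents = json.loads(content)
    if not isinstance(documents, list):
        raise ValueError("Expected a JSON array of documents.")
    for doc in documents:
        unknown = set(doc) - EXPECTED_FIELDS
        if unknown:
            raise ValueError(
                f"Document {doc.get('id')!r} has fields not in the index: {sorted(unknown)}"
            )
        if not isinstance(doc.get("id"), str):
            raise ValueError("The key field 'id' must be a string.")
    return documents
```

The returned list could then be passed to `search_client.upload_documents(documents)` as in the code above.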

Now I am trying to run a vector search this way:

# Define the context and query
context = """
You are a bot to assist in finding information from bills that cause result to be the highest.
Result is the total money that we make off of the bill.
The main fields to look at when determining if a claim is worth working are originalbalancedue,adjustedbalancedue, result.
Users will ask if certain bills are worth it to work.
When they ask if it is worth it to work, analyze the existing bill data to see if 
bills  have higher results.
"""
query = context + " Using the bills provided, which bills worth working"
embedding = client.embeddings.create(input=query, model=embedding_model_name, 
               dimensions=azure_openai_embedding_dimensions).data[0].embedding



# Perform the vector search
results = search_client.search(
    search_text=None,
    vectors=[vector_query],
    select=["id", "result"]
)

# Process and print the results
for result in results:
    print(f"ID: {result['id']}, Revcodes: {result['result']}")

In this last part (the vector search attempt) I get an error:

     10 query = context + " Using the claims provided, which codes denote claims worth working"
 ---> 11 embedding = client.embeddings.create(input=query, model=embedding_model_name, dimensions=azure_openai_embedding_dimensions).data[0].embedding
      15 # Create the vector query
      16 vector_query = VectorizedQuery(vector=embedding, k_nearest_neighbors=3, fields="revcodes,revcodeamounts,revcodecount,drgcode,primarydxcode,admitdxcode")

      AttributeError: 'SearchIndexClient' object has no attribute 'embeddings'

What is the correct way to create a vector search in this case?

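Note that the model is commented out in the vectorizer (when creating the configuration). I don't know how to use the model name when it isn't defined anywhere.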

azure azure-ai-search
1 answer
0 votes

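The error occurs because you are using SearchIndexClient instead of AzureOpenAI to create the embeddings. You need to create an Azure OpenAI client and then use that client to generate the embeddings.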

Your code would look something like this:

from openai import AzureOpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
import json

openai_credential = DefaultAzureCredential()
token_provider = get_bearer_token_provider(openai_credential, "https://cognitiveservices.azure.com/.default")

client = AzureOpenAI(
    azure_deployment=azure_openai_embedding_deployment,
    api_version=azure_openai_api_version,
    azure_endpoint=azure_openai_endpoint,
    api_key=azure_openai_key,
    azure_ad_token_provider=token_provider if not azure_openai_key else None
)
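With that client in place, the query embedding comes from `embeddings.create` on the Azure OpenAI client, not on `SearchIndexClient`. A minimal sketch follows; `embed_query` is a hypothetical helper, and it assumes the `model` argument should be the name of your embedding deployment.

```python
def embed_query(openai_client, text, deployment):
    """Return the embedding vector for `text` using an OpenAI-style client."""
    if not hasattr(openai_client, "embeddings"):
        # SearchIndexClient and SearchClient end up here: they manage indexes
        # and documents, and do not generate embeddings.
        raise TypeError(
            f"{type(openai_client).__name__} cannot create embeddings; "
            "pass an AzureOpenAI client instead."
        )
    response = openai_client.embeddings.create(input=text, model=deployment)
    return response.data[0].embedding
```

For example, `embedding = embed_query(client, query, azure_openai_embedding_deployment)`, where `client` is the `AzureOpenAI` instance. Giving the search clients distinct names (for instance `index_client` and `search_client`) avoids reusing `client` for two different things, which is what triggered the original error.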

Code reference: https://github.com/Azure/azure-search-vector-samples/blob/main/demo-python/code/basic-vector-workflow/azure-search-vector-python-sample.ipynb
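Once the embedding exists, it can be wrapped in a `VectorizedQuery` and passed to `SearchClient.search`. This is a sketch under several assumptions: a recent `azure-search-documents` release in which `search` accepts `vector_queries`, and an index that has a vector field of type `Collection(Edm.Single)` whose dimensions match the embedding model. The field name `resultVector` and the helper `vector_search` are hypothetical.

```python
def vector_search(search_client, embedding, vector_field="resultVector", k=3):
    """Run a pure vector query and return (id, result) pairs."""
    # Imported inside the function so the helper can be defined
    # even where the SDK is not installed.
    from azure.search.documents.models import VectorizedQuery

    vector_query = VectorizedQuery(
        vector=embedding,
        k_nearest_neighbors=k,
        fields=vector_field,  # must name vector fields, not plain string fields
    )
    results = search_client.search(
        search_text=None,
        vector_queries=[vector_query],
        select=["id", "result"],
    )
    return [(doc["id"], doc["result"]) for doc in results]
```

Note that `fields` has to list vector fields that exist in the index. The field list in the traceback above (`revcodes`, `drgcode`, and so on) does not appear in the index definition shown in the question.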
