In today’s e-commerce landscape, product descriptions and SEO metadata are crucial to engaging customers and improving search visibility. At Wain.cr, a wine importer and online distributor, we started with only 180 products out of 786 having any meaningful product page descriptions, metadata, or proper categorization. To rectify this, we implemented a scalable solution using Anaconda Python, OpenAI’s API, and Azure's Bing Search API to generate full descriptions, optimize metadata, and categorize the entire product catalog.
This article outlines our journey in enriching Wain.cr's product catalog, from conceptualizing the approach to deploying Python scripts that leveraged AI to automate and scale the process. By the end, you’ll have a step-by-step guide, along with Python code examples, to DIY this process in your own business.
TABLE OF CONTENTS |
I. Phase 1: Product Title Optimization and Categorization |
Initially, Wain.cr's catalog had limited information—about 180 products had proper descriptions, SEO titles, and metadata. The rest were blank, hindering both customer experience and SEO performance. Our goal was to enrich the catalog in a way that provided real value to customers while improving search engine rankings.
Step 1: Defining the Strategy
The first task was to optimize the product titles and categories for 786 products. We began by gathering essential product details such as wine variety, region, country, and brand, all of which we could extract programmatically using a combination of OpenAI’s GPT model and Bing Search API.
Step 2: Python Script to Fetch Information
We used Python to automate the extraction of relevant product information. First, we created a script that would pull relevant data from both OpenAI and Bing Search API to match each product title with its correct metadata. The script split the process into:
This structured data allowed us to generate optimized titles, enriched product metadata, and better categorize each product.
Here’s an example of the Python code used for this phase:
import pandas as pd
import openai
import requests
import time
# Load your product catalog CSV
df = pd.read_csv('wain-cr-product-catalog-titles.csv', encoding='UTF-8')
# Set OpenAI API key
openai.api_key = 'your-openai-api-key'
# Set Bing Search API key
bing_api_key = 'your-bing-api-key'
# Function to search Bing API
def search_bing(product_title):
search_url = "https://api.bing.microsoft.com/v7.0/search"
headers = {"Ocp-Apim-Subscription-Key": bing_api_key}
params = {"q": product_title, "count": 1}
response = requests.get(search_url, headers=headers, params=params)
if response.status_code == 200:
search_results = response.json()
if 'webPages' in search_results:
return search_results['webPages']['value'][0]['snippet']
return "No relevant data found"
# Function to generate OpenAI enriched data
def generate_openai_enrichment(product_title):
prompt = f"""You are tasked with optimizing a product catalog for wine. Based on the product title: {product_title}, return a structured description including:
- Grape Varietals
- Alcohol Content
- Professional Ratings
- Tasting Notes (Appearance, Nose, Palate)
- Recommended Pairings."""
response = openai.Completion.create(
model="gpt-3.5-turbo",
prompt=prompt,
max_tokens=300,
temperature=0.5
)
return response['choices'][0]['text']
# Create a new DataFrame for storing results
output_data = []
for index, row in df.iterrows():
product_title = row['Title']
# Get Bing search result
search_result = search_bing(product_title)
# Get OpenAI enriched details
enriched_details = generate_openai_enrichment(product_title)
# Append results
output_data.append({
'Handle': row['Handle'],
'Title': product_title,
'Bing Search Result': search_result,
'Enriched Details': enriched_details
})
# Sleep to avoid hitting API rate limits
time.sleep(30)
# Save the enriched data to CSV
output_df = pd.DataFrame(output_data)
output_df.to_csv('enriched_product_catalog.csv', index=False, encoding='UTF-8')
Step 3: Processing and Organizing Data
After running the script, we used the enriched data to create accurate and SEO-optimized product titles and metadata. This not only improved search engine visibility but also provided valuable context for customers browsing the site.
Having optimized the product titles and categories, we moved on to generating detailed product descriptions for each wine in the full catalog (only 180 products out of 786 had any meaningful product descriptions showcasing any value to potential consumers -especially those not familiar with top rated / high priced wines). We needed descriptions that were:
Step 1: Defining the Prompts
We set in place a process I learnt from the AI Exchange's Prompting for AI Operations Certification Program which aims to get the BEST prompt for a task through testing and iterating with Evals. An evals process in the domain of AI consists of creating and running small evaluation tests to improve prompt performance, by clearly defining prompt alternatives, examples, and definining evaluation criteria to choose the prompt with optimal results. In this case, the evals process actually resulted in actually generating two different prompts which generated two versions of product descriptions:
Step 2: Python Script to Generate Descriptions
We used OpenAI to generate two types of descriptions for each product and a short summary for metadata. The script pulled product titles from a CSV file and generated descriptions in bulk.
import pandas as pd
import openai
import time
# Load your product catalog CSV
df = pd.read_csv('wain-cr-product-catalog-titles.csv', encoding='UTF-8')
# Set OpenAI API key
openai.api_key = 'your-openai-api-key'
# Define the prompts
prompt_1 = """
You are tasked with enriching and optimizing a product catalog for wine. Based on the wine title '{product_title}', generate an enriched, structured description:
- Grape Varietals
- Alcohol Content
- Professional Ratings
- Tasting Notes (Appearance, Nose, Palate)
- Recommended Pairings
"""
prompt_2 = """
You are an AI copywriter for a high-end e-commerce wine store. Based on the wine title '{product_title}', write a persuasive, conversational description that speaks directly to the customer, highlighting:
- Unique characteristics of the wine
- Region and winery history
- Food pairings
- Notable awards or ratings
End with a subtle call to action encouraging purchase.
"""
prompt_summary = """
Summarize the following product description into a 145-character short description:
'{conversational_description}'
"""
# Function to generate OpenAI response
def generate_gpt_response(prompt):
response = openai.Completion.create(
model="gpt-3.5-turbo",
prompt=prompt,
max_tokens=1000,
temperature=0.4
)
return response['choices'][0]['text']
# Initialize an empty list to store output data
output_data = []
# Loop through each product
for index, row in df.iterrows():
product_title = row['Title']
# Generate structured details
prompt_1_filled = prompt_1.format(product_title=product_title)
structured_details = generate_gpt_response(prompt_1_filled)
# Generate conversational description
prompt_2_filled = prompt_2.format(product_title=product_title)
conversational_description = generate_gpt_response(prompt_2_filled)
# Generate short description
prompt_summary_filled = prompt_summary.format(conversational_description=conversational_description)
short_description = generate_gpt_response(prompt_summary_filled)
# Store outputs
output_data.append({
'Handle': row['Handle'],
'Title': product_title,
'Structured Details': structured_details,
'Conversational Description': conversational_description,
'Short Description': short_description
})
# Add a delay to avoid rate limits
time.sleep(30)
# Convert the data to DataFrame and save as CSV
output_df = pd.DataFrame(output_data)
output_df.to_csv('enriched_wain_cr_descriptions.csv', index=False, encoding='UTF-8')
Step 3: Results
Rich content and categorization on product pages now also allows Wain.cr to develop more accurate applications such as RAG chatbots, predictive and personalization models based on matching customer preferences, as well as richer BI reporting over visitation and purchase events happening across their ecommerce product catalog.
Optimizing an ecommerce catalog at scale using AI is hands-down a winning move! That said though, a Word of Caution: The evals process proved that the LLM model used [gpt-3.5-turbo
] resulted in accurate results due to the nature of the product category (wines), but this might not be the case for all product categories. In some (if not most) cases, a Retreival Augmented Generation (RAG) process might be best advised, where you would firstly encode data you already have at hand about your product catalog, and then request the LLM to paraphrase, categorize, structure and/or enrich it for you. This will provide results grounded on veritable data related to your actual product catalog and eliminate any potential hallucinations on the results.
In just two phases, we transformed Wain.cr's product catalog from 180 incomplete listings to 786 enriched and optimized products. This not only improved SEO rankings but also enhanced the customer experience, driving more informed purchasing decisions.
If you're looking to implement something similar for your business, feel free to follow the steps outlined in this article and use the Python code provided. Should you need assistance, don’t hesitate to reach out to me—I'd be happy to help you scale your content optimization efforts.