ðŸ"¦ Teeccino Analytics Database

teeccino_analytics.db â€" The single source of truth for all Teeccino ecommerce reporting

14
Tables
357 MB
Size
1.4M+
Total Rows
Jan 2023 â†' Present
Date Range

ðŸ"" How the Database Gets Updated

The analytics pipeline runs daily at 7:00 AM PST via run_pipeline.py. It can also be triggered manually for any date or as a full rebuild.

â"€â"€ Step 1: Ingestion â"€â"€
ingest_shopify.py â†' Shopify GraphQL Admin API â†' raw_orders + raw_line_items + raw_discount_allocations + order_journey + order_moments
ingest_meta.py â†' Daily Meta Ads CSV (or Meta API) â†' raw_meta_ads
ingest_google.py â†' Daily Google Ads CSV (or Google API) â†' raw_google_ads
ingest_shipstation.py ShipStation Shipments API raw_shipments

â"€â"€ Step 2: Transform â"€â"€
transform_daily.py â†' raw_orders + raw_line_items â†' calc_daily_shopify (43 daily aggregate columns)
transform_sku.py â†' raw_line_items + price_list + cogs â†' calc_daily_sku_sales (per-SKU daily metrics)
transform_attribution.py â†' raw_orders + order_journey â†' calc_attribution (first/last touch per order)
transform_cohorts.py â†' raw_orders â†' calc_customer_cohorts (full rebuild only â€" too expensive daily)

â"€â"€ Step 3: Output â"€â"€
build_all_matrices.py calc tables + raw tables 3 Excel workbooks (14 sheets)
build_mtd_dashboard.py calc_daily_shopify + raw_meta/google_ads mtd.html GitHub Pages
build_ytd_dashboard.py calc_daily_shopify + raw_meta/google_ads ytd.html GitHub Pages
Update Schedule: The daily pipeline runs at 7:00 AM PST, processing the previous day's data. A full rebuild (all dates since Jan 2023) takes ~15 minutes.

Commands:
python run_pipeline.py â€" Transform today's data
python run_pipeline.py --ingest â€" Ingest from APIs + transform
python run_pipeline.py --date 2026-04-01 â€" Process a specific date
python run_pipeline.py --full â€" Full rebuild of all calculated tables

ðŸ"¥ Raw Tables â€" Ingested Data

Source data pulled directly from APIs and CSVs. These are the ground truth â€" never modified by transforms.

raw_orders
Raw 147,846 rows Jan 2023 â†' Apr 2026 â- ̧
Every Shopify order, fetched via the GraphQL Admin API by ingest_shopify.py. Includes DTC, wholesale (Walmart/Target/Faire), and subscription orders. Excludes cancelled/voided. One row per order.

Source: Shopify GraphQL orders query â†' customer journey + financial data
ColumnTypeDescription
shopify_id PKTEXTShopify GID (e.g. gid://shopify/Order/5048552259832)
order_numberTEXTHuman-readable order number (e.g. #59925)
created_atTEXTISO 8601 timestamp when order was placed (UTC)
dateTEXTOrder date in PST (YYYY-MM-DD). Used as the canonical date for all reporting.
customer_idTEXTShopify customer GID
customer_emailTEXTCustomer email address
customer_order_indexINTEGERNth order for this customer (1 = new customer, 2+ = returning)
subtotalREALOrder subtotal before shipping/tax (after line-item discounts)
total_discountsREALTotal discount amount applied to the order
total_shippingREALShipping charged to customer
total_priceREALFinal amount collected (net revenue)
total_taxREALTax amount
tagsTEXTComma-separated Shopify tags (used for wholesale detection: Walmart, Target, Faire)
source_nameTEXTCheckout source (checkout_next, shopify_draft_order, etc.)
financial_statusTEXTPayment status (PAID, PARTIALLY_REFUNDED, REFUNDED)
is_subscriptionINTEGER1 if any line item has a selling plan (subscription)
is_first_subscriptionINTEGER1 if this is the customer's first subscription order
is_wholesaleINTEGER1 if tagged Walmart, Target, or Faire
wholesale_channelTEXTWhich wholesale channel (Walmart, Target, Faire, or null)
fetched_atTEXTWhen this row was last fetched/updated from Shopify
Indexes: shopify_id (unique PK), date
raw_line_items
Raw 505,739 rows Jan 2023 â†' Apr 2026 â- ̧
Individual line items within each order. One row per SKU per order. Used by transform_sku.py to calculate True Gross Sales (quantity Ã- price_list_price).

Source: Shopify GraphQL lineItems within each order
ColumnTypeDescription
id PKINTEGERAuto-increment ID
order_idTEXTFK â†' raw_orders.shopify_id
dateTEXTOrder date (denormalized for fast SKU queries)
skuTEXTProduct SKU (e.g. 43111, 80060L)
product_nameTEXTFull product name
variant_titleTEXTVariant (e.g. "25 tea bags", "11 ounce grind")
quantityINTEGERUnits purchased in this line item
original_unit_priceREALPrice per unit before discounts
total_priceREALTotal line item price after discounts
selling_plan_idTEXTShopify selling plan GID (non-null = subscription)
is_subscriptionINTEGER1 if purchased via subscription selling plan
is_wholesaleINTEGER1 if parent order is wholesale
Indexes: order_id, date, sku
raw_discount_allocations
Raw 169,542 rows â- ̧
Per-line-item discount allocations from Shopify. Shows exactly how much each discount code/automatic discount took off each line item. Critical for the Discount Matrix.

Source: Shopify GraphQL discountAllocations on each line item
ColumnTypeDescription
id PKINTEGERAuto-increment ID
order_idTEXTFK â†' raw_orders.shopify_id
line_item_indexINTEGERPosition of line item within the order (0-based)
skuTEXTSKU of the discounted item
amountREALDollar amount of this discount allocation
discount_typeTEXTType: discount_code, automatic, manual
discount_titleTEXTName of the discount (e.g. "NEWYEAR", "Free Gift with Purchase")
application_typenameTEXTShopify type: DiscountCodeApplication, AutomaticDiscountApplication, ManualDiscountApplication
Indexes: order_id
raw_meta_ads
Raw 1,153 rows Feb 2023 â†' Apr 2026 â- ̧
Daily Meta (Facebook/Instagram) advertising performance. One row per day.

Source: ingest_meta.py â†' reads Daily Meta Ads.csv (generated by daily_meta_report.py from Meta Marketing API, account 559530661221525)
ColumnTypeDescription
date PKTEXTDate (YYYY-MM-DD)
spendREALTotal ad spend ($)
impressionsINTEGERTotal impressions
clicksINTEGERLink clicks
reachINTEGERUnique users reached
purchasesINTEGERAttributed purchases (1-day click / 7-day click)
purchase_valueREALAttributed purchase revenue ($)
fetched_atTEXTWhen this row was last ingested
Indexes: date (unique PK)
raw_google_ads
Raw 1,180 rows Jan 2023 â†' Apr 2026 â- ̧
Daily Google Ads performance. One row per day.

Source: ingest_google.py â†' reads Daily Google Ads.csv (generated by daily_google_report.py from Google Ads API, customer ID 806-386-6750)
ColumnTypeDescription
date PKTEXTDate (YYYY-MM-DD)
spendREALTotal ad spend ($)
impressionsINTEGERTotal impressions
clicksINTEGERTotal clicks
conversionsREALAttributed conversions (fractional â€" Google uses data-driven attribution)
conversion_valueREALAttributed conversion revenue ($)
fetched_atTEXTWhen this row was last ingested
Indexes: date (unique PK)
raw_shipments
Raw 106,580 rows Jan 2024 → Apr 2026
Shipment records from ShipStation. One row per shipment (a single order may have multiple shipments). Contains actual carrier costs, carrier/service codes, destination info, and weights.

Source: ingest_shipstation.py → ShipStation REST API /shipments endpoint
ColumnTypeDescription
shipment_id PKTEXTShipStation shipment ID
order_idTEXTShipStation order ID
order_numberTEXTHuman-readable order number (matches Shopify order number)
ship_dateTEXTDate shipped (YYYY-MM-DD)
carrier_codeTEXTCarrier (dhl_global_mail, stamps_com, fedex)
service_codeTEXTService (e.g. dhl_parcel_ground, usps_priority_mail)
shipment_costREALActual carrier cost paid ($) — what we pay to ship
insurance_costREALInsurance cost ($)
weight_ozREALPackage weight in ounces
tracking_numberTEXTCarrier tracking number
to_stateTEXTDestination state
to_countryTEXTDestination country code
is_voidedINTEGER1 if voided/cancelled
is_returnINTEGER1 if return label
fetched_atTEXTWhen last fetched from API
Indexes: shipment_id (PK), ship_date, order_number

🧭 Journey Tables â€" Customer Attribution

Marketing touchpoint data from Shopify's customer journey tracking. Shows how customers discovered and returned to the store before purchasing.

order_journey
Journey 147,846 rows â- ̧
First-touch and last-touch attribution summary for each order. Extracted from Shopify's customerJourneySummary GraphQL field. One row per order.

Source: ingest_shopify.py â†' Shopify GraphQL customerJourneySummary
ColumnTypeDescription
order_id PKTEXTFK â†' raw_orders.shopify_id
customer_order_indexINTEGERCustomer's nth order
days_to_conversionINTEGERDays from first visit to purchase
first_visit_sourceTEXTFirst-touch channel (email, direct, google, facebook, etc.)
first_visit_source_typeTEXTSource type classification
first_visit_source_descriptionTEXTHuman-readable description from Shopify
first_visit_landing_pageTEXTFirst landing page URL
first_visit_referrer_urlTEXTReferrer URL for first visit
first_visit_referral_codeTEXTReferral code if applicable
first_visit_atTEXTISO timestamp of first visit
first_visit_utm_sourceTEXTUTM source parameter
first_visit_utm_mediumTEXTUTM medium parameter
first_visit_utm_campaignTEXTUTM campaign parameter
first_visit_utm_termTEXTUTM term parameter
first_visit_utm_contentTEXTUTM content parameter
last_visit_sourceTEXTLast-touch channel before conversion
last_visit_source_typeTEXTLast-touch source type
last_visit_source_descriptionTEXTDescription of last visit
last_visit_landing_pageTEXTLast landing page before conversion
last_visit_referrer_urlTEXTLast referrer URL
last_visit_referral_codeTEXTLast referral code
last_visit_atTEXTISO timestamp of last visit before purchase
last_visit_utm_*TEXTUTM parameters for last visit (source, medium, campaign, term, content)
moments_countINTEGERTotal touchpoints in the customer journey
journey_readyINTEGER1 if Shopify has finalized journey data for this order
Indexes: order_id (unique PK)
order_moments
Journey 265,028 rows â- ̧
Individual marketing touchpoints ("moments") in each customer's journey. Multiple rows per order â€" one per visit/interaction before purchase.

Source: ingest_shopify.py â†' Shopify GraphQL customerJourneySummary.moments
ColumnTypeDescription
id PKINTEGERAuto-increment ID
order_idTEXTFK â†' raw_orders.shopify_id
moment_indexINTEGERChronological index of this moment (0-based)
occurred_atTEXTISO timestamp when the touchpoint occurred
sourceTEXTChannel source (email, direct, google, facebook, etc.)
source_typeTEXTSource type classification
source_descriptionTEXTHuman-readable description
landing_pageTEXTURL the customer landed on
referrer_urlTEXTReferring URL
referral_codeTEXTReferral/affiliate code if present
utm_sourceTEXTUTM source parameter
utm_mediumTEXTUTM medium parameter
utm_campaignTEXTUTM campaign parameter
utm_termTEXTUTM term parameter
utm_contentTEXTUTM content parameter
Indexes: order_id

âšTMï ̧ Calculated Tables â€" Transformed Data

Derived from raw tables by transform scripts. These power the MTD dashboard and reports. Rebuilt daily (or on-demand).

calc_daily_shopify
Calculated 1,189 rows Jan 2023 â†' Apr 2026 â- ̧
Daily aggregate KPIs for Shopify. This is the primary table that powers the MTD dashboard. One row per day with 43 columns covering orders, revenue, discounts, subscriptions (with True Gross breakdown by first-time vs recurring), free items, and wholesale.

Source: transform_daily.py â†' aggregates raw_orders + raw_line_items + raw_discount_allocations
Gross formula: True Gross = Σ(quantity Ã- price_list_price) per line item, using monthly price list. Free items excluded — items with unit_price = $0 (BOGO/gifts) are tracked separately in dtc_free_item_value.
ColumnTypeDescription
date PKTEXTDate (YYYY-MM-DD)
â€" All-Channel Totals â€"
total_ordersINTEGERTotal orders (DTC + wholesale)
new_ordersINTEGERNew customer orders (customer_order_index = 1)
ret_ordersINTEGERReturning customer orders
total_revenueREALTotal net revenue (all channels)
new_revenueREALRevenue from new customers
ret_revenueREALRevenue from returning customers
â€" DTC (Direct-to-Consumer) â€"
dtc_ordersINTEGERDTC orders only (excludes Walmart/Target/Faire)
dtc_new_ordersINTEGERDTC new customer orders
dtc_ret_ordersINTEGERDTC returning customer orders
dtc_grossREALTrue Gross Sales = Σ(units Ã- monthly price list price) for DTC line items
dtc_revenueREALDTC net revenue (total_price from orders)
dtc_new_revenueREALDTC revenue from new customers
dtc_ret_revenueREALDTC revenue from returning customers
dtc_subtotalREALSum of order subtotals (after line discounts, before shipping)
â€" Discount Breakdown â€"
dtc_discountsREALTotal discount amount (all types combined)
dtc_subscription_discountsREALSubscription discount allocations
dtc_automatic_discountsREALAutomatic discount allocations
dtc_code_discountsREALDiscount code allocations
dtc_manual_discountsREALManual discount allocations
dtc_shipping_discountsREALShipping discount allocations
â€" Free Items â€"
dtc_free_item_valueREALRetail value of free items given away
dtc_free_item_countINTEGERNumber of free items
dtc_free_item_cogsREALCost of goods for free items
â€" Subscriptions â€"
sub_ordersINTEGERTotal subscription orders
sub_revenueREALSubscription revenue
first_sub_ordersINTEGERFirst-time subscription orders (new subscribers)
recurring_sub_ordersINTEGERRecurring subscription orders
sub_grossREALTrue Gross for subscription orders only (units × price list price, excl. free items)
sub_new_grossREALTrue Gross for first-time subscription orders
sub_new_revenueREALNet revenue from first-time subscription orders
sub_recurring_grossREALTrue Gross for recurring subscription orders
sub_recurring_revenueREALNet revenue from recurring subscription orders
sub_unitsINTEGERTotal units sold via subscription
â€" Wholesale â€"
ws_ordersINTEGERWholesale orders (Walmart + Target + Faire)
ws_grossREALWholesale gross sales
ws_revenueREALWholesale net revenue
â€" Totals â€"
subtotalREALSum of all order subtotals
total_discountsREALSum of all discounts
total_shippingREALSum of customer-paid shipping
total_revenue_allREALTotal revenue across all channels
total_unitsINTEGERTotal units sold
calculated_atTEXTTimestamp when this row was last calculated
Indexes: date (unique PK)
calc_daily_sku_sales
Calculated 139,051 rows Jan 2023 â†' Apr 2026 â- ̧
Daily sales by SKU. One row per SKU per day (only days with sales). Used for SKU profitability analysis and the SKU breakdown in matrices.

Source: transform_sku.py â†' aggregates raw_line_items + joins price_list + cogs
ColumnTypeDescription
date PKTEXTDate
sku PKTEXTProduct SKU
units_soldINTEGERTotal units sold (all channels)
gross_salesREALGross sales = units Ã- price list price
net_salesREALNet sales after discounts
discountsREALTotal discounts applied to this SKU
dtc_unitsINTEGERDTC units sold
dtc_grossREALDTC gross sales
dtc_netREALDTC net sales
dtc_discountsREALDTC discounts
ws_unitsINTEGERWholesale units sold
ws_grossREALWholesale gross sales
sub_unitsINTEGERSubscription units
cogsREALCost of goods sold (units Ã- cogs_per_unit from cogs table)
calculated_atTEXTWhen this row was calculated
Indexes: date + sku (composite PK), date, sku
calc_attribution
Calculated 147,830 rows Jan 2023 â†' Apr 2026 â- ̧
Simplified marketing attribution per order. Maps first/last touch journey data to a single attributed_channel (Meta, Google, Email, Direct, Organic, Other).

Source: transform_attribution.py â†' joins raw_orders + order_journey
ColumnTypeDescription
order_id PKTEXTFK â†' raw_orders.shopify_id
dateTEXTOrder date
first_touch_sourceTEXTFirst-touch source (parsed from UTM/referrer)
first_touch_mediumTEXTFirst-touch medium
first_touch_campaignTEXTFirst-touch campaign name
last_touch_sourceTEXTLast-touch source before conversion
last_touch_mediumTEXTLast-touch medium
last_touch_campaignTEXTLast-touch campaign name
attributed_channelTEXTSimplified channel: Meta, Google, Email, Direct, Organic, Other
revenueREALOrder revenue
is_new_customerINTEGER1 if first order
is_subscriptionINTEGER1 if subscription order
days_to_conversionINTEGERDays from first visit to purchase
touchpoint_countINTEGERNumber of touchpoints in the journey
Indexes: order_id (unique PK), date
calc_customer_cohorts
Calculated 60,591 rows â- ̧
Customer-level cohort analysis. One row per unique customer. Groups customers by their first-order month and calculates LTV at 30/90/365 days.

Source: transform_cohorts.py â†' aggregates raw_orders by customer. Only runs on full rebuild (too expensive for daily).
ColumnTypeDescription
customer_id PKTEXTShopify customer GID
cohort_monthTEXTMonth of first order (YYYY-MM)
first_order_dateTEXTDate of customer's first order
total_ordersINTEGERTotal lifetime orders
total_revenueREALTotal lifetime revenue
ltv_30dREALRevenue within 30 days of first order
ltv_90dREALRevenue within 90 days of first order
ltv_365dREALRevenue within 365 days of first order
is_subscriberINTEGER1 if customer has ever placed a subscription order
last_order_dateTEXTDate of most recent order
Indexes: customer_id (unique PK)

ðŸ"‹ Reference Tables â€" Lookups & Constants

Static or semi-static reference data used by transform scripts. Updated when prices change or new COGS data arrives.

price_list
Reference 10,904 rows â- ̧
Monthly retail price list for every SKU. This is critical â€" it's used to calculate True Gross Sales. Prices change when Teeccino adjusts pricing (e.g., the Jan 2023 price increase).

Source: PriceList/monthly/*.json files on OneDrive, generated by generate_monthly_price_reference.py from Shopify GraphQL. 40 months covered (Jan 2023 → Apr 2026)
ColumnTypeDescription
year_month PKTEXTMonth (YYYY-MM)
sku PKTEXTProduct SKU
retail_priceREALRetail price for this SKU in this month ($)
source_fileTEXTSource JSON filename
Indexes: year_month + sku (composite PK)
cogs
Reference 1,517 rows â- ̧
Cost of Goods Sold per SKU. Used to calculate product-level margins and the Profit Waterfall.

Source: COGS/Cogs.xlsx on OneDrive, imported by create_db.py
ColumnTypeDescription
sku PKTEXTProduct SKU
cogs_per_unitREALCost per unit ($). 0.0 for non-physical items.
effective_dateTEXTWhen this COGS rate became effective (may be null)
source_sheetTEXTSheet in Cogs.xlsx this came from (e.g. "Jan 2026")
Indexes: sku (unique PK)