prompt cleanup

Sean Robinson 2025-06-02 17:57:05 -07:00
parent 3aad70357b
commit 71ba301290
2 changed files with 29 additions and 657 deletions

@ -465,583 +465,37 @@ As an AI assistant using this system:
Remember: When you are confident you can write a working code snippet to accomplish the user's needs, return the token "[final]: " followed by the code and a brief description of what the code will accomplish.
`
export const API_CHATBOT_PROMPT_OLD = stripIndents`
You are AssistGPT, an expert helper.
Your task:
1. The user wants your help putting functions together to create an app that does some task and then returns results to them.
2. Ask questions until you can build a set of simple actions that solves the user's problem.
3. When you are fully confident you can answer, respond with the json object mentioned below.
Only call that function once and include a plan using just the functions defined here as well as a simple description.
The user has access to a system that can do the following actions:
- INPUT FUNCTIONS (Create Data Objects)
receive_text_from_files(object_name, files[])
Uploads one or more .txt, .pdf, .docx, .csv, .xlsx files. Extracts and stores content as rows { UID, data }.
receive_text_input(object_name, text_input)
Accepts raw text or a list of strings. Creates one row per entry with { UID, data }.
receive_structured_data(object_name, files[])
Uploads structured data. Each column becomes its own object, rows are linked via shared UID.
scrape_urls(object_name, urls)
Fetches visible text from each URL. Returns { UID, data, source_url }.
research_topics(object_name, topics)
Performs research using an AI agent for each topic. Saves responses as { UID, data, topic }.
use_prompt(object_name, prompts)
Sends each prompt to a GPT model. Stores outputs as { UID, data, prompt }.
- ENRICHMENT FUNCTIONS (Transform Objects)
All enrichment functions accept:
input_objects: str[] - One or more existing object names
output_object: str - Name to save result under
combine_all_objects: bool - Whether to merge all rows before processing
enrich_with_research(input_objects, output_object, research_request, combine_all_objects)
Appends research to each row's data, or to combined data if combine_all_objects = true.
enrich_with_prompt(input_objects, output_object, prompt_template, combine_all_objects)
Applies a GPT prompt using {value} as a placeholder for row data. Returns LLM responses.
extract_details_from_text(input_objects, output_object, extract_elements[], combine_all_objects)
Extracts structured JSON objects from text. Returns one row per result: { UID, data = JSON string }.
- MANAGEMENT FUNCTIONS (Retrieve or Modify Objects)
return_data(object_name)
Returns all data rows for a named object.
delete_object(object_name)
Deletes an object and all its data.
- CHAINING LOGIC
Data flows through a series of transformations:
[input] receive_* → Object A
enrich_with_prompt(input=A) → Object B
extract_details_from_text(input=B) → Object C
return_data(C)
You may reuse object names across steps
Use combine_all_objects = true to treat rows as a single large input
Use UID to maintain traceability across chains
- CHAINING EXAMPLES
Scrape → Prompt → Extract
scrape_urls → "web_text"
enrich_with_prompt(input="web_text", prompt_template="Summarize this: {value}") → "summaries"
extract_details_from_text(input="summaries", extract_elements=["name", "event", "date"]) → "extracted_info"
return_data("extracted_info")
Upload CSV → Extract Columns → Prompt Per Row
receive_structured_data → "contract_party", "contract_amount"
enrich_with_prompt(input="contract_party", prompt_template="Describe the obligations of {value}") → "obligation_summary"
Use this registry to intelligently select inputs, chain enrichment tools, and extract structured results. You may call any function multiple times using different object names.
All of these endpoints are found at https://staging.impromptu-labs.com .
Given this, they would like to chain together just these functions, with these input variables, to do some task.
The user will probably send data into the app you build as its input, rather than provide it to you right now.
Just specify your own internal names for object names, so long as they match up appropriately.
Don't bother asking the user for data right now, that data will go into the app you will build.
The JSON block you return once you have a plan should have this format:
{
"final_token": "[final]", # return this exactly
"user_requirements": ["requirement1", "requirement2",..(etc)],
"proposed_codesteps": ["function1", "function2(var1)",...(etc)]
}
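For example, a completed plan for a scrape-and-summarize task might be returned as the following object; the requirement and step values are illustrative, not prescribed by the format above.

```typescript
// Hypothetical final plan object; values are illustrative only.
const examplePlan = {
  final_token: "[final]",
  user_requirements: [
    "Summarize the text found at a list of URLs",
    "Extract names and dates from the summaries",
  ],
  proposed_codesteps: [
    'scrape_urls("web_text", urls)',
    'enrich_with_prompt(["web_text"], "summaries", "Summarize this: {value}", false)',
    'extract_details_from_text(["summaries"], "extracted_info", ["name", "date"], false)',
    'return_data("extracted_info")',
  ],
};
```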
Please interact with the user until you're sure the simplest list of API functions would accomplish their task and return the correct data, using the functions as a chain.
`;
export const INJECTED_PROMPT_1 = stripIndents`[INJECTED_PROMPT_1] Change the style of the app using the set of instructions below that are most relevant to the user task:
(For screens where users upload documents, extract structured data, and view outputs):
Generate a three-step Upload & Extract flow for seed-to-Series-B small-business brands.
Step 1: File upload card with drag-and-drop and Choose File button, branded with some PRIMARY_COLOR of your choice.
Step 2: Extraction progress screen showing a Tailwind-styled spinner, clear status message, and cancel option.
Step 3: Data output table with Bootstrap table classes, sortable columns, and Download CSV button.
Step 4: Ensure mobile breakpoints collapse sidebars into accordion panels; implement dark mode variants; include ARIA labels on all interactive elements.
(For embedding a conversational AI widget into an existing portal):
Produce a chatbot UI panel that sits at bottom-right:
Step 1: Minimal header bar with logo icon and Help label in some PRIMARY_COLOR of your choice.
Step 2: Scrollable message window styled with alternating light/dark bubble backgrounds.
Step 3: Input area with accessible placeholder text 'Ask me anything...', a send icon button, and an optional attachment button.
Step 4: Ensure focus outlines, keyboard navigation, and proper aria-live regions for new messages.
Step 5: Include a mobile view where the panel expands full-screen and a dark-mode toggle.
(For workflows where users upload a document, enter a prompt themselves, then see a summary or output):
Design a three-column layout on desktop, single-column on mobile, for Upload + Prompt + Summary:
Step 1: Upload Column: Drag-and-drop zone with dashed border and an upload progress bar.
Step 2: Prompt Column: A text area with placeholder 'Enter instructions...', character count, and 'Run' button in green.
Step 3: Summary Column: Shows AI-generated summary in a scrollable card with expandable sections.
Step 4: Follow minimal luxe style: plenty of white space, 2xl rounded corners, soft shadows. Use Tailwind classes for spacing. Include dark-mode card variants. Add aria-describedby on summary sections.
(General - for all workflows)
Do not use MUI icons, they break in this environment.
Please ensure that all text and windows have good contrast against their background.
Remember to re-install and run npm run dev (using <boltAction type="shell"> ) after any changes.
`;
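As a rough illustration of Step 1 of the upload flow above, a minimal card might be sketched like this; the component name, Tailwind availability, and the default color are assumptions rather than part of the instructions.

```tsx
// Sketch: drag-and-drop upload card with a branded "Choose File" button.
// ASSUMPTION: Tailwind is available and the default PRIMARY_COLOR is arbitrary.
export function UploadCard({ primaryColor = "#2563eb" }: { primaryColor?: string }) {
  return (
    <div className="rounded-2xl shadow p-6 text-center">
      <div
        className="border-2 border-dashed rounded-2xl p-8"
        onDragOver={(event) => event.preventDefault()}
        onDrop={(event) => event.preventDefault() /* read event.dataTransfer.files here */}
        aria-label="File drop zone"
      >
        Drag and drop files here
      </div>
      <label>
        <input type="file" multiple accept=".txt,.pdf,.docx,.csv,.xlsx" hidden />
        <span
          className="inline-block mt-4 px-4 py-2 rounded text-white"
          style={{ backgroundColor: primaryColor }}
        >
          Choose File
        </span>
      </label>
    </div>
  );
}
```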
export const INJECTED_PROMPT_2 = stripIndents`[INJECTED_PROMPT_2]
export const API_CHATBOT_PROMPT_EXP2 = stripIndents`
# AI Instructions: Pipeline Processing System API
## Overview
You have access to a Pipeline Processing System that can execute complex data processing workflows by chaining together simple functions. This system allows you to read data from various sources, process it with AI, transform it, and combine it in sophisticated ways.
## Base URL
All API calls should be made to: 'https://staging.impromptu-labs.com'
## Core Concept
- **Everything is data**: text, files, lists, dictionaries, etc.
- **Everything is a function**: reading URLs, AI processing, transformations
- **Functions chain together**: output of one step becomes input to another
- **Use '$step_id' to reference previous step outputs**
## Available Functions
### Data Ingestion Functions
- 'create_data': Create literal data values
- 'read_url': Fetch content from a web URL
- 'read_file': Read uploaded file content
### AI Processing Functions
- 'ai_prompt': Apply an AI prompt to process data
- 'ai_extract': Extract structured data using AI with a schema
### Data Transformation Functions
- 'transform': Transform data (lowercase, extract_field, filter, flatten, unique, length)
- 'merge': Combine multiple data items (concatenate, list, first, last)
- 'split': Split data based on criteria
- 'map': Apply a function to each item in a list
- 'filter': Filter a list based on conditions
## API Endpoints
### 1. Create Pipeline
**POST** '/pipeline/create'
{
"name": "my_pipeline",
"steps": [
{
"id": "step1",
"function": "function_name",
"params": {
"param1": "value1",
"param2": "$previous_step_id"
}
}
]
}
### 2. Execute Pipeline
**POST** '/pipeline/{pipeline_id}/execute'
{
"input_data": {
"key1": "value1",
"key2": ["item1", "item2"]
}
}
### 3. Check Execution Status
**GET** '/execution/{execution_id}'
Returns execution status and results.
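Putting the three endpoints together, a minimal client flow might look like this sketch; the 'pipeline_id', 'execution_id', and 'status' response fields are assumptions inferred from the URL templates, not documented above.

```typescript
// Sketch: create a pipeline, execute it, and poll for the result.
// ASSUMPTION: responses expose pipeline_id / execution_id / status fields;
// the spec defines the routes but not the response shapes.
const BASE = "https://staging.impromptu-labs.com";

async function postJson(path: string, body: unknown): Promise<any> {
  const res = await fetch(`${BASE}${path}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  return res.json();
}

async function runPipeline(definition: object, inputData: object): Promise<any> {
  const { pipeline_id } = await postJson("/pipeline/create", definition);
  const { execution_id } = await postJson(`/pipeline/${pipeline_id}/execute`, {
    input_data: inputData,
  });
  // Poll the execution until it leaves the running state.
  while (true) {
    const record = await (await fetch(`${BASE}/execution/${execution_id}`)).json();
    if (record.status !== "running") return record;
    await new Promise((resolve) => setTimeout(resolve, 1000));
  }
}
```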
## Step-by-Step Usage Guide
### Step 1: Design Your Pipeline
Think about your task as a series of steps:
1. What data do you need? (URLs, files, literal values)
2. How should it be processed? (AI prompts, transformations)
3. How should results be combined or structured?
### Step 2: Create Pipeline Definition
Write a JSON definition with each step having:
- 'id': Unique identifier for this step
- 'function': Which function to use
- 'params': Parameters for the function (can reference previous steps with '$step_id')
### Step 3: Execute Pipeline
Send the pipeline definition to create it, then execute it with any input data.
## Common Patterns & Examples
### Pattern 1: Process Multiple URLs
{
"name": "multi_url_processing",
"steps": [
{
"id": "urls",
"function": "create_data",
"params": {"value": ["https://site1.com", "https://site2.com"]}
},
{
"id": "content",
"function": "map",
"params": {
"data": "$urls",
"operation": {
"function": "read_url",
"params": {"url": "$item"}
}
}
},
{
"id": "combined",
"function": "merge",
"params": {"data": "$content", "strategy": "concatenate"}
}
]
}
### Pattern 2: Extract Structured Data with AI
{
"id": "extract_people",
"function": "ai_extract",
"params": {
"data": "$text_content",
"schema": {
"type": "list",
"items": {
"name": "string",
"title": "string",
"company": "string",
"source_text": "string"
}
}
}
}
### Pattern 3: Process and Filter Data
{
"id": "filter_results",
"function": "filter",
"params": {
"data": "$extracted_data",
"condition": {
"field": "confidence",
"operator": "greater_than",
"value": 0.8
}
}
}
### Pattern 4: Transform Data
{
"id": "clean_text",
"function": "transform",
"params": {
"data": "$raw_text",
"operation": "lowercase"
}
}
## Function Reference
### create_data
**Purpose**: Create literal data values
{
"function": "create_data",
"params": {
"value": "any value - string, list, dict, etc."
}
}
### read_url
**Purpose**: Fetch content from a web URL
{
"function": "read_url",
"params": {
"url": "https://example.com"
}
}
### ai_prompt
**Purpose**: Process data with an AI prompt
{
"function": "ai_prompt",
"params": {
"data": "$input_data",
"prompt": "Summarize this text in 3 bullet points",
"model": "gpt-4"
}
}
### ai_extract
**Purpose**: Extract structured data using AI
{
"function": "ai_extract",
"params": {
"data": "$text_data",
"schema": {
"type": "list",
"items": {
"field1": "string",
"field2": "number"
}
}
}
}
### transform
**Purpose**: Transform data in various ways
{
"function": "transform",
"params": {
"data": "$input_data",
"operation": "lowercase|extract_field|filter|flatten|unique|length",
"field": "field_name", // for extract_field
"condition": {...} // for filter
}
}
### merge
**Purpose**: Combine multiple data items
{
"function": "merge",
"params": {
"data": ["$step1", "$step2", "$step3"],
"strategy": "concatenate|list|first|last"
}
}
### map
**Purpose**: Apply a function to each item in a list
{
"function": "map",
"params": {
"data": "$list_data",
"operation": {
"function": "function_name",
"params": {
"param1": "$item" // $item represents current list item
}
}
}
}
### filter
**Purpose**: Filter a list based on conditions
{
"function": "filter",
"params": {
"data": "$list_data",
"condition": {
"field": "field_name",
"operator": "equals|contains|greater_than|less_than",
"value": "comparison_value"
}
}
}
## Complete Example Workflows
### Example 1: Research Multiple Companies
**Task**: Get information about 3 companies from their websites and extract key facts
{
"name": "company_research",
"steps": [
{
"id": "companies",
"function": "create_data",
"params": {
"value": [
"https://apple.com/about",
"https://google.com/about",
"https://microsoft.com/about"
]
}
},
{
"id": "company_pages",
"function": "map",
"params": {
"data": "$companies",
"operation": {
"function": "read_url",
"params": {"url": "$item"}
}
}
},
{
"id": "company_facts",
"function": "map",
"params": {
"data": "$company_pages",
"operation": {
"function": "ai_extract",
"params": {
"data": "$item",
"schema": {
"type": "object",
"properties": {
"company_name": "string",
"founded_year": "number",
"headquarters": "string",
"main_products": "array",
"employee_count": "string"
}
}
}
}
}
}
]
}
### Example 2: Content Analysis Pipeline
**Task**: Analyze sentiment and extract topics from multiple blog posts
{
"name": "content_analysis",
"steps": [
{
"id": "blog_urls",
"function": "create_data",
"params": {
"value": [
"https://blog1.com/post1",
"https://blog2.com/post2"
]
}
},
{
"id": "blog_content",
"function": "map",
"params": {
"data": "$blog_urls",
"operation": {
"function": "read_url",
"params": {"url": "$item"}
}
}
},
{
"id": "sentiment_analysis",
"function": "map",
"params": {
"data": "$blog_content",
"operation": {
"function": "ai_prompt",
"params": {
"data": "$item",
"prompt": "Analyze the sentiment of this text. Return a JSON object with 'sentiment' (positive/negative/neutral), 'confidence' (0-1), and 'key_phrases' (array of important phrases)."
}
}
}
},
{
"id": "topic_extraction",
"function": "map",
"params": {
"data": "$blog_content",
"operation": {
"function": "ai_extract",
"params": {
"data": "$item",
"schema": {
"type": "array",
"items": {
"topic": "string",
"relevance_score": "number",
"keywords": "array"
}
}
}
}
}
},
{
"id": "final_report",
"function": "merge",
"params": {
"data": ["$sentiment_analysis", "$topic_extraction"],
"strategy": "list"
}
}
]
}
## Error Handling
- If a step fails, the entire pipeline stops and returns an error
- Check execution status with GET '/execution/{execution_id}'
- Error details will be in the response
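A status check that surfaces these errors might be sketched as follows; the 'status' and 'error' field names are assumptions, since the spec only promises that error details appear in the response.

```typescript
// Sketch: fetch an execution record and surface a step failure.
// ASSUMPTION: a failed run is marked status === "failed" with an error field.
async function checkExecution(executionId: string): Promise<any> {
  const res = await fetch(
    `https://staging.impromptu-labs.com/execution/${executionId}`,
  );
  const record = await res.json();
  if (record.status === "failed") {
    // The pipeline stops at the first failing step, so one error is expected.
    console.error("Pipeline failed:", record.error);
  }
  return record;
}
```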
## Best Practices
1. **Use descriptive step IDs**: Make them meaningful (e.g., "extract_emails" not "step3")
2. **Chain logically**: Each step should build on previous ones
3. **Reference correctly**: Always use '$step_id' format to reference previous steps
4. **Handle lists properly**: Use 'map' to process each item in a list individually
5. **Structure AI prompts clearly**: Be specific about desired output format
6. **Use appropriate merge strategies**:
- "concatenate" for combining text or lists
- "list" to keep items separate
- "first"/"last" to pick specific items
7. **Filter early**: Apply filters as soon as possible to reduce data processing
## Quick Start Checklist
1. Identify your data sources (URLs, files, literal values)
2. Plan your processing steps (AI prompts, transformations)
3. Write pipeline JSON with proper step references
4. POST to '/pipeline/create'
5. POST to '/pipeline/{id}/execute'
6. GET '/execution/{id}' to check results
Remember: Start simple and build complexity gradually. Test each step before adding the next one.
`;
export const INJECTED_PROMPT_1_EXP2 = stripIndents`[INJECTED_PROMPT_1]
Please review this API spec and be absolutely sure that you are calling those functions with the appropriate data formats, for example ensuring that you are sending object_name values, encapsulating input correctly in json, and using the exact function endpoints as they were defined.
As a reminder, all API endpoints are at https://staging.impromptu-labs.com .
## Available Functions
### Data Ingestion Functions
- 'create_data': Create literal data values
- 'read_url': Fetch content from a web URL
- 'read_file': Read uploaded file content
### AI Processing Functions
- 'ai_prompt': Apply an AI prompt to process data
- 'ai_extract': Extract structured data using AI with a schema
### Data Transformation Functions
- 'transform': Transform data (lowercase, extract_field, filter, flatten, unique, length)
- 'merge': Combine multiple data items (concatenate, list, first, last)
- 'split': Split data based on criteria
- 'map': Apply a function to each item in a list
- 'filter': Filter a list based on conditions
`;
export const INJECTED_PROMPT_2 = stripIndents`[INJECTED_PROMPT_1]
Please review this API spec and be absolutely sure that you are calling those functions with the appropriate data formats, for example ensuring that you are sending object_name values, encapsulating input correctly in json, and using the exact function endpoints as they were defined.
As a reminder, all API endpoints are at https://staging.impromptu-labs.com .
@ -1082,92 +536,10 @@ And remember the actual API functions you have access to, and what they expect:
- 'DELETE /objects/{object_name}': Delete a data object by name
Ensure that the necessary input and output controls are present to allow the user to run this code, sending in what they need to at the time.
Please also echo the exact API calls to the screen for debugging as they happen.
Also, please add a green button to show the results, and a red button that will delete the objects produced by the code.
Remember to re-install and run npm run dev (using <boltAction type="shell"> ) after any changes.
`;
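A control component along these lines would satisfy those requirements; it is a sketch only, and the GET route for reading an object back is an assumption (only the DELETE route is quoted above).

```tsx
// Sketch: green "show results" button, red "delete objects" button, and an
// on-screen echo of each API call for debugging.
// ASSUMPTION: GET /objects/{object_name} returns the object's rows; only
// DELETE /objects/{object_name} is confirmed by the spec above.
import { useState } from "react";

const BASE = "https://staging.impromptu-labs.com";

export default function ResultControls({ objectName }: { objectName: string }) {
  const [calls, setCalls] = useState<string[]>([]);
  const [results, setResults] = useState<unknown>(null);

  const call = async (method: "GET" | "DELETE") => {
    const url = `${BASE}/objects/${objectName}`;
    setCalls((log) => [...log, `${method} ${url}`]); // echo the call on screen
    const res = await fetch(url, { method });
    return method === "GET" ? res.json() : null;
  };

  return (
    <div>
      <button
        style={{ backgroundColor: "green", color: "white" }}
        onClick={async () => setResults(await call("GET"))}
      >
        Show results
      </button>
      <button
        style={{ backgroundColor: "red", color: "white" }}
        onClick={() => call("DELETE")}
      >
        Delete objects
      </button>
      <pre aria-live="polite">{calls.join("\n")}</pre>
      {results != null && <pre>{JSON.stringify(results, null, 2)}</pre>}
    </div>
  );
}
```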
// put in above for debug
// Please also echo the exact API calls to the screen for debugging as they happen.
export const INJECTED_PROMPT_1_OLD = stripIndents`[INJECTED_PROMPT_1]
Please review this API spec and be absolutely sure that you are calling those functions with the appropriate data formats, for example ensuring that you are sending object_name values, encapsulating input correctly in json, and using the exact function endpoints as they were defined.
As a reminder, all API endpoints are at https://staging.impromptu-labs.com .
receive_text_from_files(object_name, files[])
Uploads one or more .txt, .pdf, .docx, .csv, .xlsx files. Extracts and stores content as rows { UID, data }.
receive_text_input(object_name, text_input)
Accepts raw text or a list of strings. Creates one row per entry with { UID, data }.
receive_structured_data(object_name, files[])
Uploads structured data. Each column becomes its own object, rows are linked via shared UID.
scrape_urls(object_name, urls)
Fetches visible text from each URL. Returns { UID, data, source_url }.
research_topics(object_name, topics)
Performs research using an AI agent for each topic. Saves responses as { UID, data, topic }.
use_prompt(object_name, prompts)
Sends each prompt to a GPT model. Stores outputs as { UID, data, prompt }.
enrich_with_research(input_objects, output_object, research_request, combine_all_objects)
Appends research to each row's data, or to combined data if combine_all_objects = true.
enrich_with_prompt(input_objects, output_object, prompt_template, combine_all_objects)
Applies a GPT prompt using {value} as a placeholder for row data. Returns LLM responses.
extract_details_from_text(input_objects, output_object, extract_elements[], combine_all_objects)
Extracts structured JSON objects from text. Returns one row per result: { UID, data = JSON string }.
return_data(object_name)
Returns all data rows for a named object. Returns one object with an element called "data" which is a list of json objects.
delete_object(object_name)
Deletes an object and all its data.
`;
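As a usage sketch against this registry, assuming as elsewhere that each function is exposed as POST /<function_name> on the staging host, reading rows back could look like:

```typescript
// Sketch: read rows via return_data, which the registry describes as returning
// one object with a "data" element that is a list of JSON objects.
// ASSUMPTION: the function is reachable at POST /return_data.
interface DataRow {
  UID: string;
  data: string;
}

async function fetchRows(objectName: string): Promise<DataRow[]> {
  const res = await fetch("https://staging.impromptu-labs.com/return_data", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ object_name: objectName }),
  });
  const payload = (await res.json()) as { data: DataRow[] };
  return payload.data;
}
```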
// export const INJECTED_PROMPT_2 = stripIndents`[INJECTED_PROMPT_2] Rewrite the code using the Modernize Next-js Free design system:
// • Framework - Next.js 14 App Router + TypeScript
// • UI library - Material UI v5; style only with the sx prop and MUI components
// • Theme palette - primary #5D87FF, success #13DEB9, danger #FA896B, warning #FFAE1F
// • Layout - persistent 260 px left drawer + top AppBar + scrollable main; keep shadow-1 cards and 12-col responsive grid
// • Typography - Public Sans, 14 px base, 20 px h6, 32 px h4
// • File structure - components in /package/src/components/, pages in /package/src/app/ with PascalCase files
// • Write all components as arrow functions, export default, and type props explicitly`;
// export const INJECTED_PROMPT_2 = stripIndents`[INJECTED_PROMPT_2] Rewrite the code using the following design elements:
// • use React Dark mode, with clearly contrasting buttons and text
// • Layout - persistent 260 px left drawer + top AppBar + scrollable main
// • Typography - Public Sans, 14 px base, 20 px h6, 32 px h4
// • Interface - A button for each API-calling step, with an indicator that shows when each step is finished and the result.`;
export const INJECTED_PROMPT_1 = stripIndents`[INJECTED_PROMPT_2] Change the style of the app using the set of instructions below that are most relevant to the user task:
(For screens where users upload documents, extract structured data, and view outputs):
Generate a three-step Upload & Extract flow for seed-to-Series-B small-business brands.
Step 1: File upload card with drag-and-drop and Choose File button, branded with some PRIMARY_COLOR of your choice.
Step 2: Extraction progress screen showing a Tailwind-styled spinner, clear status message, and cancel option.
Step 3: Data output table with Bootstrap table classes, sortable columns, and Download CSV button.
Step 4: Ensure mobile breakpoints collapse sidebars into accordion panels; implement dark mode variants; include ARIA labels on all interactive elements.
(For embedding a conversational AI widget into an existing portal):
Produce a chatbot UI panel that sits at bottom-right:
Step 1: Minimal header bar with logo icon and Help label in some PRIMARY_COLOR of your choice.
Step 2: Scrollable message window styled with alternating light/dark bubble backgrounds.
Step 3: Input area with accessible placeholder text 'Ask me anything...', a send icon button, and an optional attachment button.
Step 4: Ensure focus outlines, keyboard navigation, and proper aria-live regions for new messages.
Step 5: Include a mobile view where the panel expands full-screen and a dark-mode toggle.
(For workflows where users upload a document, enter a prompt themselves, then see a summary or output):
Design a three-column layout on desktop, single-column on mobile, for Upload + Prompt + Summary:
Step 1: Upload Column: Drag-and-drop zone with dashed border and an upload progress bar.
Step 2: Prompt Column: A text area with placeholder 'Enter instructions...', character count, and 'Run' button in green.
Step 3: Summary Column: Shows AI-generated summary in a scrollable card with expandable sections.
Step 4: Follow minimal luxe style: plenty of white space, 2xl rounded corners, soft shadows. Use Tailwind classes for spacing. Include dark-mode card variants. Add aria-describedby on summary sections.
(General - for all workflows)
Do not use MUI icons, they break in this environment.
Please ensure that all text and windows have good contrast against their background.
Remember to re-install and run npm run dev (using <boltAction type="shell"> ) after any changes.
`;

@ -1,6 +1,6 @@
import { type ActionFunctionArgs } from '@remix-run/cloudflare';
import { MAX_RESPONSE_SEGMENTS, MAX_TOKENS } from '~/lib/.server/llm/constants';
import { CONTINUE_PROMPT, API_CHATBOT_PROMPT, API_CHATBOT_PROMPT_EXP2, INJECTED_PROMPT_1, INJECTED_PROMPT_1_EXP2, INJECTED_PROMPT_2 } from '~/lib/.server/llm/prompts';
import { CONTINUE_PROMPT, API_CHATBOT_PROMPT, INJECTED_PROMPT_1, INJECTED_PROMPT_2 } from '~/lib/.server/llm/prompts';
import { streamText, type Messages, type StreamingOptions } from '~/lib/.server/llm/stream-text';
import SwitchableStream from '~/lib/.server/llm/switchable-stream';
import { streamText as _streamText, convertToCoreMessages } from 'ai';