-
DeepSeek models
-
The name of the inference service is now included in the
Model
parameters andResults
objects. This can be useful when the same model is provided by multiple services. -
The model pricing page at Coop shows daily test results for available models: https://www.expectedparrot.com/home/pricing
- Default size limits on question texts have been removed.
- Modified default RPM to avoid timeout issues.
- Question type
QuestionDict
returns a response as a dictionary with specified keys and (optionally) specified value types and descriptions. Details: https://docs.expectedparrot.com/en/latest/questions.html#questiodict-class
-
Results of jobs run remotely are no longer automatically synced to your local cache. Now, a new cache for results is automatically generated and attached to a results object; you can access it by calling
results.cache
. Results now also include the following fields for the associated cache:cache_keys.<question_name>_cache_key
(the unique identifier for a cache entry) andcache_used.<question_name>_cache_used
(an indicator whether the default cache was used to provide the response--this is either your local cache or remote cache, or a cache that was passed to therun
method, if used instead of local or remote). -
Improvements to the web-based progress bar for remote jobs.
- Occasional timeout issue should be fixed by modifications to caching noted above.
-
Question type
QuestionMatrix
. Details: https://docs.expectedparrot.com/en/latest/questions.html#questionmatrix-class -
A
join()
method for objects. -
FileStore
methodcreate_link()
embeds a file in the HTML of a notebook and generates a download link for it. Examples: https://docs.expectedparrot.com/en/latest/filestore.html
-
Exceptions report is displayed as a clickable link.
-
Improvements to table display of results returned by
select()
method. -
Improvements to status messages displayed in a table log when a job is running.
-
Model.available()
now uses Coop by default (all models available with remote inference are returned). If remote inference is not activated then only models available locally are returned (based on stored personal API keys).
- Progress bar shows total interviews instead of total unique interviews (iterations may be >1).
Results
are now automatically displayed in a scrollable table when you callselect()
on them. You can also calltable().long()
to display results in a long-view table. This replaces the need to callprint(format="rich")
. See examples in the starter tutorial.
- The progress bar is now web-based and a link to view it in a new tab is automatically returned when you call the
run()
method on a survey (progress_bar=True
by default). See examples in the starter tutorial.
- Results were automatically appending cache; this was removed.
- EDSL Authentication Token: If you attempt to run a survey remotely without having stored your EXPECTED_PARROT_API_KEY, a message will appear providing a Coop login link. Clicking this link and logging in will automatically store your key in your .env file.
-
The
AgentList
methodfrom_csv()
now allows you to (optionally) automatically specify thename
parameters for agents by including a column "name" in the CSV. Other columns are (still) passed as agenttraits
. See an example: https://docs.expectedparrot.com/en/latest/agents.html#from-a-csv-file -
The
Job
methodrun()
now takes a parameterremote_inference_results_visibility
to set the visibility of results of jobs that are being run remotely. The allowed visibility settings arepublic
,private
andunlisted
(the default setting is unlisted). This parameter has the same effect as passing the parametervisibility
to thepush()
andpatch()
methods for posting and updating objects at the Coop. For example, these commands have the same effect when remote inference activated:
Survey.example().run()
Survey.example().run(remote_inference_visibility="unlisted")
-
Bug in using f-strings and scenarios at once. Example usage: https://docs.expectedparrot.com/en/latest/scenarios.html#using-f-strings-with-scenarios
-
Bug in optional question parameters
answering_instructions
andquestion_presentation
, which can be used to modify user prompts separately from modifying question texts. Example usage: https://docs.expectedparrot.com/en/latest/questions.html#optional-question-parameters
-
Method
show_prompts()
can be called on aSurvey
to display the user prompt and system prompt. This is in addition to the existing methodprompts()
that is called on aJob
which will return the prompts and additional information about the questions, agents, models and estimated costs. Learn more: https://docs.expectedparrot.com/en/latest/prompts.html -
Documentation on storing API keys as "secrets" for using EDSL in Colab.
-
Conversation
module works with multiple models at once. -
Improved features for adding new models.
- Access to Open AI o1 models
-
Survey Builder is a new interface for creating and launching hybrid human-AI surveys. It is fully integrated with EDSL and Coop. Get access by activating beta features from your Coop account profile page. Learn more: https://docs.expectedparrot.com/en/latest/survey_builder.html
-
Jobs
methodshow_prompts()
returns a table showing the user and system prompts that will be used with a survey, together with information about the agent and model and estimated cost for each interview.Jobs
methodprompts
returns the information in a dataset. -
Scenario
objects can contain multiple images to be presented to a model at once (works with Google models).
- Bug in piping a
ScenarioList
containing multiple lists ofquestion_options
to use with questions.
-
Optional parameters for
Question
objects:include_comment = False
prevents acomment
field from being added to a question (default isTrue
: all question types other than free text automatically include a field for the model to comment on its answer, unless this parameter is passed)use_code = True
modifies user prompts for question types that takequestion_options
to instruct the model to return the integer code for an option instead of the option value (default isFalse
)answering_instructions
andquestion_presentation
allow you to control exact prompt language and separate instructions for the presentation of a questionpermissive = True
turns off enforcement of question constraints (e.g., if min/max selections for a checkbox question have been specified, you can setpermissive = True
to allow responses that contain fewer or greater selections) (default isFalse
)
-
Methods for
Question
objects:loop()
generates a list of versions of a question for aScenarioList
that is passed to it. Questions are constructed with a{{ placeholder }}
for a scenario as usual, but each scenario value is added to the question when it is created instead of when a survey is run (which is done with theby()
method). Survey results for looped questions include fields for each unique question but noscenario
field. See examples: https://docs.expectedparrot.com/en/latest/starter_tutorial.html#adding-scenarios-using-the-loop-method and https://docs.expectedparrot.com/en/latest/scenarios.html#looping
-
Methods for
ScenarioList
objects:unpivot()
expands a scenario list by specified identifierspivot()
undoesunpivot()
, collapsing scenarios by identifiersgive_valid_names()
generates valid Pythonic identifiers for scenario keysgroup_by()
groups scenarios by identifiers or applies a function to the values of the specified variablesfrom_wikipedia_table()
converts a Wikipedia table into a scenario list. See examples: https://docs.expectedparrot.com/en/latest/notebooks/scenario_list_wikipedia.htmlto_docx()
exports scenario lists as structured Docx documents
-
Optional parameters for
Model
objects:raise_validation_errors = False
causes exceptions to only be raised (interrupting survey execution) when a model returns nothing at all (default:raise_validation_errors = True
)print_exceptions = False
causes exceptions to not be printed at all (default:print_exceptions = True
)
-
Columns in
Results
for monitoring token usage:generated_tokens
shows the tokens that were generated by the modelraw_model_response.<question_name>_cost
shows the cost of the result for the question, applying the token quanities & pricesraw_model_response.<question_name>_one_usd_buys
shows the number of results for the question that 1USD will buyraw_model_response.<question_name>_raw_model_response
shows the raw response for the question
-
Methods for
Results
objects:tree()
displays a nested tree for specified componentsgenerate_html()
andsave_html()
generate and save HTML code for displaying results
-
General improvements to exceptions reports.
-
General improvements to the progress bar:
survey.run(progress_bar=True)
-
Question validation methods no longer use JSON. This will eliminate exceptions relating to JSON errors previously common to certain models.
-
Base agent instructions template is not added to a job if no agent is used with a survey (reducing tokens).
-
The
select()
method (forResults
andScenarioList
) now allows partial match on key names to save typing.
-
Bug in enforcement of token/rate limits.
-
Bug in generation of exceptions report that excluded agent information.
-
Models: AWS Bedrock & Azure
-
Question: New method
loop()
allows you to create versions of questions when you are constructing a survey. It takes aScenarioList()
as a parameter and returns a list ofQuestion
objects.
- Bug in
Survey
question piping prevented you from adding questions after piping.
-
ScenarioList.from_sqlite
allows you to create a list of scenarios from a SQLite table. -
Added LaTeX support to SQL outputs and ability to write to files:
Results.print(format="latex", filename="example.tex")
-
Options that we think of as "terminal", such as
sql()
,print()
,html()
, etc., now take atee
boolean that causes them to returnself
. This is useful for chaining, e.g., if you runprint(format = "rich", tee = True)
it will returnself
, which allows you do also runprint(format = "rich", tee = True).print(format = "latex", filename = "example.tex")
.
- Ability to create a
Scenario
forquestion_options
. Example:
from edsl import QuestionMultipleChoice, Scenario
q = QuestionMultipleChoice(
question_name = "capital_of_france",
question_text = "What is the capital of France?",
question_options = "{{question_options}}"
)
s = Scenario({'question_options': ['Paris', 'London', 'Berlin', 'Madrid']})
results = q.by(s).run()
- Prompts visibility: Call
prompts()
on aJobs
object for a survey to inspect the prompts that will be used in a survey before running it. For example:
from edsl import Model, Survey
j = Survey.example().by(Model())
j.prompts().print(format="rich")
-
Piping: Use agent traits and components of questions (question_text, answer, etc.) as inputs to other questions in a survey (e.g.,
question_text = "What is your last name, {{ agent.first_name }}?"
orquestion_text = "Name some examples of {{ prior_q.answer }}"
orquestion_options = ["{{ prior_q.answer[0]}}", "{{ prior_q.answer[1] }}"]
). Examples: https://docs.expectedparrot.com/en/latest/surveys.html#id2 -
Agent traits: Call agent traits directly (e.g.,
Agent.example().age
will return22
).
- A bug in piping to allow you to pipe an
answer
intoquestion_options
. Examples: https://docs.expectedparrot.com/en/latest/surveys.html#id2
-
Method
add_columns()
allows you to add columns toResults
. -
Class
ModelList
allows you to create a list ofModel
objects, similar toScenarioList
andAgentList
.
Conjure
module allows you to import existing survey data and reconstruct it as EDSL objects. See details on methodsto_survey()
,to_results()
,to_agent_list()
and renaming/modifying objects: https://docs.expectedparrot.com/en/latest/conjure.html
- Method
rename()
allows you to rename questions, agents, scenarios, results.
- New language models from OpenAI, Anthropic, Google will be added automatically when they are released by the platforms.
- Removed an errant break point in language models module.
-
Scenario.rename()
allows you to rename fields of a scenario. -
Scenario.chunk()
allows you to split a field into chunks of a given size based onnum_word
ornum_lines
, creating aScenarioList
. -
Scenario.from_html()
turns the contents of a website into a scenario. -
Scenario.from_image()
creates an image scenario to use with a vision model (e.g., GPT-4o). -
ScenarioList.sample()
allows you to take a sample from a scenario list. -
ScenarioList.tally()
allows you to tally fields in scenarios. -
ScenarioList.expand()
allows you to expand a scenario by a field in it, e.g., if a scenario field contains a list the method can be used to break it into separate scenarios. -
ScenarioList.mutate()
allows you to add a key/value to each scenario. -
ScenarioList.order_by()
allows you to order the scenarios. -
ScenarioList.filter()
allows you to filter the scenarios based on a logical expression. -
ScenarioList.from_list()
allows you to create a ScenarioList from a list of values and specified key. -
ScenarioList.add_list()
allows you to use a list to add values to individual scenarios. -
ScenarioList.add_value()
allows you to add a value to all the scenarios. -
ScenarioList.to_dict()
allows you to turn a ScenarioList into a dictionary. -
ScenarioList.from_dict()
allows you to create a ScenarioList from a dictionary. -
Results.drop()
complementsResults.select()
for identifying the components that you want to print in a table. -
ScenarioList.drop()
similarly complementsScenarioList.select()
.
- Improvements to exceptions reports: Survey run exceptions now include the relevant job components and are optionally displayed in an html report.
-
We started a blog! https://blog.expectedparrot.com
-
Agent
/AgentList
methodremove_trait(<trait_key>)
allows you to remove a trait by name. This can be useful for comparing combinations of traits. -
Agent
/AgentList
methodtranslate_traits(<codebook_dict>)
allows you to modify traits based on a codebook passed as dictionary. Example:
agent = Agent(traits = {"age": 45, "hair": 1, "height": 5.5})
agent.translate_traits({"hair": {1:"brown"}})
This will return: Agent(traits = {'age': 10, 'hair': 'brown', 'height': 5.5})
-
AgentList
methodget_codebook(<filename>)
returns the codebook for a CSV file. -
AgentList
methodfrom_csv(<filename>)
loads anAgentList
from a CSV file with the column names astraits
keys. Note that the CSV column names must be valid Python identifiers (e.g.,current_age
and notcurrent age
). -
Results
methodto_scenario_list()
allows you to turn any components of results into a list of scenarios to use with other questions. A default parameterremove_prefixes=True
will remove the results component prefixesagent.
,answer.
,comment.
, etc., so that you don't have to modify placeholder names for the new scenarios. Example: https://docs.expectedparrot.com/en/latest/scenarios.html#turning-results-into-scenarios -
ScenarioList
methodto_agent_list()
converts aScenarioList
into anAgentList
. -
ScenarioList
methodfrom_pdf(<filename>)
allows you to import a PDF and automatically turn the pages into a list of scenarios. Example: https://docs.expectedparrot.com/en/latest/scenarios.html#turning-pdf-pages-into-scenarios -
ScenarioList
methodfrom_csv(<filename>)
allows you to import a CSV and automatically turn the rows into a list of scenarios. -
ScenarioList
methodfrom_pandas(<dataframe>)
allows you to import a pandas dataframe and automatically turn the rows into a list of scenarios. -
Scenario
methodfrom_image(<image_path>)
creates a scenario with a base64 encoding of an image. The scenario is formatted as follows:"file_path": <filname / url>, "encoded_image": <generated_encoding>
Note that you need to use a vision model (e.g.,model = Model('gpt-4o')
) and you do not need to add a{{ placeholder }}
for the scenario (for now--this might change!). Example:
from edsl.questions import QuestionFreeText
from edsl import Scenario, Model
model = Model('gpt-4o')
scenario = Scenario.from_image('general_survey.png') # Image from this notebook: https://docs.expectedparrot.com/en/latest/notebooks/data_labeling_agent.html
# scenario
q = QuestionFreeText(
question_name = "example",
question_text = "What is this image showing?" # We do not need a {{ placeholder }} for this kind of scenario
)
results = q.by(scenario).by(model).run(cache=False)
results.select("example").print(format="rich")
Returns:
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ answer ┃
┃ .example ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ This image is a flowchart showing the process of creating and administering a survey for data labeling tasks. │
│ The steps include importing data, creating data labeling tasks as questions about the data, combining the │
│ questions into a survey, inserting the data as scenarios of the questions, and administering the same survey to │
│ all agents. │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
-
Question
andSurvey
methodhtml()
generates an improved html page representation of the object. You can optionally specify the filename and css. See default css:edsl/edsl/surveys/SurveyExportMixin.py
Line 10 in 9d981fa
-
QuestionMultipleChoice
now takes numbers and lists asquestion_options
(e.g.,question_options = [[1,2,3], [4,5,6]]
is allowed). Previously options had to be a list of strings (i.e.,question_options = ['1','2','3']
is still allowed but not required).
- Optional parameter in
Results
methodto_list()
to flatten a list of lists (eg, responses toQuestionList
):results.to_list(flatten=True)
- Erroneous error messages about adding rules to a survey.
- New
Survey
method to export a survey to file. Usage:generated_code = survey.code("example.py")
- A bug in
Survey
methodadd_skip_logic()
- New methods for adding, sampling and shuffling
Results
objects:dup_results = results + results
results.shuffle()
results.sample(n=5)
-
Optional parameter
survey.run(cache=False)
if you do not want to access any cached results in running a survey. -
Instructions passed to an agent at creation are now a column of results:
agent_instruction
- Methods for setting session caches
New function
set_session_cache
will set the cache for a session:
from edsl import Cache, set_session_cache
set_session_cache(Cache())
The cache can be set to a specific cache object, or it can be set to a dictionary or SQLite3Dict object:
from edsl import Cache, set_session_cache
from edsl.data import SQLiteDict
set_session_cache(Cache(data = SQLiteDict("example.db")))
# or
set_session_cache(Cache(data = {}))
The unset_session_cache
function is used to unset the cache for a session:
from edsl import unset_session_cache
unset_session_cache()
This will unset the cache for the current session, and you will need to pass the cache object to the run method during the session.
Details: https://docs.expectedparrot.com/en/latest/data.html#setting-a-session-cache
-
Answer comments are now a separate component of results The "comment" field that is automatically added to each question (other than free text) is now stored in
Results
ascomment.<question_name>
. Prior to this change, the comment for each question was stored asanswer.<question_name>_comment
, i.e., if you ranresults.columns
the list of columns would includeanswer.<question_name>
andanswer.<question_name>_comment
for each question. With this change, the columns will now beanswer.<question_name>
andcomment.<question_name>_comment
. This change is meant to make it easier to select only the answers, e.g., runningresults.select('answer.*').print()
will no longer also include all the comments, which you may not want to display. (The purpose of the comments field is to allow the model to add any information about its response to a question, which can help avoid problems with JSON formatting when the model does not want to return just the properly formatted response.) -
Exceptions We modified exception messages. If your survey run generates exceptions, run
results.show_exceptions()
to print them in a table.
- A package that was missing for working with Anthropic models.
-
Results
objects now include columns for question components. Call the.columns
method on your results to see a list of all components. Runresults.select("question_type.*", "question_text.*", "question_options.*").print()
to see them. -
Survey
objects now also have a.to_csv()
method.
- Increased the maximum number of multiple choice answer options to 200 (previously 20) to facilitate large codebooks / data labels.
- A bug in in
Survey.add_rule()
method that caused an additional question to be skipped when used to apply a skip rule.
- New models: Run
Model.available()
to see a complete current list.
- A bug in json repair methods.
-
New documentation: https://docs.expectedparrot.com
-
Progress bar: You can now pass
progress_bar=True
to therun()
method to see a progress bar as your survey is running. Example:
from edsl import Survey
results = Survey.example().run(progress_bar=True)
Job Status
Statistic Value
─────────────────────────────────────────────────────────────────
Elapsed time 1.1 sec.
Total interviews requested 1
Completed interviews 1
Percent complete 100 %
Average time per interview 1.1 sec.
Task remaining 0
Estimated time remaining 0.0 sec.
Model Queues
gpt-4-1106-preview;TPM (k)=1200.0;RPM (k)=8.0
Number question tasks waiting for capacity 0
new token usage
prompt_tokens 0
completion_tokens 0
cost $0.00000
cached token usage
prompt_tokens 104
completion_tokens 35
cost $0.00209
- New language models: We added new models from Anthropic and Databricks. To view a complete list of available models see edsl.enums.LanguageModelType or run:
from edsl import Model
Model.available()
This will return:
['claude-3-haiku-20240307',
'claude-3-opus-20240229',
'claude-3-sonnet-20240229',
'dbrx-instruct',
'gpt-3.5-turbo',
'gpt-4-1106-preview',
'gemini_pro',
'llama-2-13b-chat-hf',
'llama-2-70b-chat-hf',
'mixtral-8x7B-instruct-v0.1']
For instructions on specifying models to use with a survey see new documentation on Language Models. Let us know if there are other models that you would like us to add!
- Cache: We've improved user options for caching LLM calls.
Old method:
Pass a use_cache
boolean parameter to a Model
object to specify whether to access cached results for the model when using it with a survey (i.e., add use_cache=False
to generate new results, as the default value is True).
How it works now:
All results are (still) cached by default. To avoid using a cache (i.e., to generate fresh results), pass an empty Cache
object to the run()
method that will store everything in it. This can be useful if you want to isolate a set of results to share them independently of your other data. Example:
from edsl.data import Cache
c = Cache() # create an empty Cache object
from edsl.questions import QuestionFreeText
results = QuestionFreeText.example().run(cache = c) # pass it to the run method
c # inspect the new data in the cache
We can inspect the contents:
Cache(data = {‘46d1b44cd30e42f0f08faaa7aa461d98’: CacheEntry(model=‘gpt-4-1106-preview’, parameters={‘temperature’: 0.5, ‘max_tokens’: 1000, ‘top_p’: 1, ‘frequency_penalty’: 0, ‘presence_penalty’: 0, ‘logprobs’: False, ‘top_logprobs’: 3}, system_prompt=‘You are answering questions as if you were a human. Do not break character. You are an agent with the following persona:\n{}’, user_prompt=‘You are being asked the following question: How are you?\nReturn a valid JSON formatted like this:\n{“answer”: “<put free text answer here>“}‘, output=’{“id”: “chatcmpl-9CGKXHZPuVcFXJoY7OEOETotJrN4o”, “choices”: [{“finish_reason”: “stop”, “index”: 0, “logprobs”: null, “message”: {“content”: “```json\\n{\\“answer\\“: \\“I\‘m doing well, thank you for asking! How can I assist you today?\\“}\\n```“, “role”: “assistant”, “function_call”: null, “tool_calls”: null}}], “created”: 1712709737, “model”: “gpt-4-1106-preview”, “object”: “chat.completion”, “system_fingerprint”: “fp_d6526cacfe”, “usage”: {“completion_tokens”: 26, “prompt_tokens”: 68, “total_tokens”: 94}}’, iteration=0, timestamp=1712709738)}, immediate_write=True, remote=False)
For more details see new documentation on Caching LLM Calls.
Coming soon: Automatic remote caching options.
- API keys:
You will no longer be prompted to enter your API keys when running a session. We recommend storing your keys in a private
.env
file in order to avoid having to enter them at each session. Alternatively, you can still re-set your keys whenever you run a session. See instructions on setting up an.env
file in our Starter Tutorial.
The Expected Parrot API key is coming soon! It will let you access all models at once and come with automated remote caching of all results. If you would like to test it out, please let us know!
- Prompts: We made it easier to modify the agent and question prompts that are sent to the models. For more details see new documentation on Prompts.
Model
attributeuse_cache
is now deprecated. See details above about how caching now works.
.run(n = ...)
now works and will run your survey with fresh results the specified number of times.
- Various fixes and small improvements
- The raw model response is now available in the
Results
object, accessed via "raw_model_response" keyword. There is one for each question. The key is the question_name +_raw_response_model
- The
.run(progress_bar = True)
returns a much more informative real-time view of job progress.
- The
answer
component of theResults
object is printed in a nicer format.
trait_name
descriptor was not working; it is now fixed.QuestionList
is now working properly again
- Results now provides a
.sql()
method that can be used to explore data in a SQL-like manner. - Results now provides a
.ggplot()
method that can be used to create ggplot2 visualizations. - Agent now admits an optional
name
argument that can be used to identify the Agent.
- Fixed various issues with visualizations. They should now work better.
- Question options can now be 1 character long or more (down from 2 characters)
- Fixed a bug where prompts displayed were incorrect (prompts sent were correct)
- Report functionalities are now part of the main package.
- Fixed a bug in the Results.print() function
- The package no longer supports a report extras option.
- Fixed a bug in EndofSurvey
- Better handling of async failures
- Fixed bug in survey logic
- Improvements in async survey running
- Added logging
- Improvements in async survey running
- Improvements in async survey running
- Support for several large language models
- Async survey running
- Asking for API keys before they are used
- Bugs in survey running
- Bugs in several question types
- Unused files
- Unused package dependencies
- Changelog file
- Image display and description text in README.md
- Unused files
- Base feature