From: ELPA Syncer
Subject: [elpa] externals/llm b9fc46f333 08/13: Resolved merge conflicts and merged upstream/main into ollama-chat-endpoint-support.
Date: Wed, 7 Feb 2024 18:58:11 -0500 (EST)

branch: externals/llm
commit b9fc46f3338fbfd7166cc4fbbaab3b4a397660db
Author: Thomas E. Allen <thomas@assistivemachines.com>
Commit: Thomas E. Allen <thomas@assistivemachines.com>

    Resolved merge conflicts and merged upstream/main into ollama-chat-endpoint-support.
---
 NEWS.org       |  5 +++++
 README.org     |  5 ++++-
 llm-gemini.el  | 15 +++++++++------
 llm-openai.el  |  4 ++--
 llm-request.el | 14 +++++++++++++-
 llm-vertex.el  | 55 +++++++++++++++++++++----------------------------------
 llm.el         |  2 +-
 7 files changed, 55 insertions(+), 45 deletions(-)

diff --git a/NEWS.org b/NEWS.org
index 12d46bea89..dd514663b9 100644
--- a/NEWS.org
+++ b/NEWS.org
@@ -1,5 +1,10 @@
+* Version 0.9.1
+- Default to the new "text-embedding-3-small" model for Open AI.  *Important*: Anyone who has stored embeddings should either regenerate embeddings (recommended) or hard-code the old embedding model ("text-embedding-ada-002").
+- Fix response breaking when prompts run afoul of Gemini / Vertex's safety checks.
+- Change Gemini streaming to use the correct URL.  This doesn't seem to have an effect on behavior.
 * Version 0.9
 - Add =llm-chat-token-limit= to find the token limit based on the model.
+- Add request timeout customization.
 * Version 0.8
 - Allow users to change the Open AI URL, to allow for proxies and other services that re-use the API.
 - Add =llm-name= and =llm-cancel-request= to the API.
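
As the 0.9.1 entry above notes, users who already store embeddings can keep them compatible by hard-coding the old model when constructing their provider. A minimal sketch, assuming ~make-llm-openai~ accepts an ~:embedding-model~ parameter, and reusing the README's hypothetical ~llm-refactoring-provider~ and ~my-openai-key~:

#+begin_src emacs-lisp
;; Sketch only: pin the previous OpenAI embedding model so that embeddings
;; stored before 0.9.1 stay compatible.  Assumes make-llm-openai takes an
;; :embedding-model keyword; my-openai-key holds your API key.
(require 'llm-openai)
(setq llm-refactoring-provider
      (make-llm-openai :key my-openai-key
                       :embedding-model "text-embedding-ada-002"))
#+end_src
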
diff --git a/README.org b/README.org
index 2a2659e598..2da47be47e 100644
--- a/README.org
+++ b/README.org
@@ -9,7 +9,7 @@ Certain functionalities might not be available in some LLMs. Any such unsupporte
 
 This package is still in its early stages but will continue to develop as LLMs and functionality are introduced.
 * Setting up providers
-Users of an application that uses this package should not need to install it themselves. The llm module should be installed as a dependency when you install the package that uses it. However, you do need to require the llm module and set up the provider you will be using. Typically, applications will have a variable you can set. For example, let's say there's a package called "llm-refactoring", which has a variable ~llm-refactoring-provider~. You would set it up like so:
+Users of an application that uses this package should not need to install it themselves. The llm package should be installed as a dependency when you install the package that uses it. However, you do need to require the llm module and set up the provider you will be using. Typically, applications will have a variable you can set. For example, let's say there's a package called "llm-refactoring", which has a variable ~llm-refactoring-provider~. You would set it up like so:
 
 #+begin_src emacs-lisp
 (use-package llm-refactoring
@@ -19,6 +19,8 @@ Users of an application that uses this package should not need to install it the
 #+end_src
 
 Here ~my-openai-key~ would be a variable you set up before with your OpenAI key. Or, just substitute the key itself as a string. It's important to remember never to check your key into a public repository such as GitHub, because your key must be kept private. Anyone with your key can use the API, and you will be charged.
+
+For embedding users: if you store the embeddings, you *must* set the embedding model.  Even though there's no way for the llm package to tell whether you are storing it, if the default model changes, you may find yourself storing incompatible embeddings.
 ** Open AI
 You can set up with ~make-llm-openai~, with the following parameters:
 - ~:key~, the Open AI key that you get when you sign up to use Open AI's APIs.  Remember to keep this private.  This is non-optional.
@@ -100,6 +102,7 @@ For all callbacks, the callback will be executed in the buffer the function was
 - ~llm-count-tokens provider string~: Count how many tokens are in ~string~.  This may vary by ~provider~, because some providers implement an API for this, but typically is always about the same.  This gives an estimate if the provider has no API support.
 - ~llm-cancel-request request~ Cancels the given request, if possible.  The ~request~ object is the return value of async and streaming functions.
 - ~llm-name provider~.  Provides a short name of the model or provider, suitable for showing to users.
+- ~llm-chat-token-limit~.  Gets the token limit for the chat model.  This isn't possible for some backends like =llama.cpp=, in which the model isn't selected or known by this library.
 
   And the following helper functions:
   - ~llm-make-simple-chat-prompt text~: For the common case of just wanting a simple text prompt without the richness that ~llm-chat-prompt~ struct provides, use this to turn a string into a ~llm-chat-prompt~ that can be passed to the main functions above.
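
As a rough usage sketch of the functions listed above (not part of this patch; ~my-openai-key~ is the hypothetical key variable from the README example):

#+begin_src emacs-lisp
(require 'llm)
(require 'llm-openai)
(let* ((provider (make-llm-openai :key my-openai-key))
       (prompt (llm-make-simple-chat-prompt "Say hello in one short sentence.")))
  ;; llm-name and llm-chat-token-limit are part of the generic API above.
  (message "Using %s, token limit %s"
           (llm-name provider)
           (llm-chat-token-limit provider))
  ;; Synchronous chat call; returns the response text.
  (llm-chat provider prompt))
#+end_src
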
diff --git a/llm-gemini.el b/llm-gemini.el
index 07b7aaa093..3c80872333 100644
--- a/llm-gemini.el
+++ b/llm-gemini.el
@@ -72,10 +72,13 @@ You can get this at https://makersuite.google.com/app/apikey."
                                     buf error-callback
                                     'error (llm-vertex--error-message data))))))
 
-(defun llm-gemini--chat-url (provider)
-  "Return the URL for the chat request, using PROVIDER."
-  (format "https://generativelanguage.googleapis.com/v1beta/models/%s:generateContent?key=%s"
+;; from https://ai.google.dev/tutorials/rest_quickstart
+(defun llm-gemini--chat-url (provider streaming-p)
+  "Return the URL for the chat request, using PROVIDER.
+If STREAMING-P is non-nil, use the streaming endpoint."
+  (format "https://generativelanguage.googleapis.com/v1beta/models/%s:%s?key=%s"
           (llm-gemini-chat-model provider)
+          (if streaming-p "streamGenerateContent" "generateContent")
           (llm-gemini-key provider)))
 
 (defun llm-gemini--get-chat-response (response)
@@ -85,7 +88,7 @@ You can get this at https://makersuite.google.com/app/apikey."
 
 (cl-defmethod llm-chat ((provider llm-gemini) prompt)
   (let ((response (llm-vertex--get-chat-response-streaming
-                   (llm-request-sync (llm-gemini--chat-url provider)
+                   (llm-request-sync (llm-gemini--chat-url provider nil)
                                     :data (llm-vertex--chat-request-streaming prompt)))))
     (setf (llm-chat-prompt-interactions prompt)
           (append (llm-chat-prompt-interactions prompt)
@@ -94,10 +97,10 @@ You can get this at https://makersuite.google.com/app/apikey."
 
 (cl-defmethod llm-chat-streaming ((provider llm-gemini) prompt partial-callback response-callback error-callback)
   (let ((buf (current-buffer)))
-    (llm-request-async (llm-gemini--chat-url provider)
+    (llm-request-async (llm-gemini--chat-url provider t)
                        :data (llm-vertex--chat-request-streaming prompt)
                        :on-partial (lambda (partial)
-                                     (when-let ((response (llm-vertex--get-partial-chat-ui-repsonse partial)))
+                                     (when-let ((response (llm-vertex--get-partial-chat-response partial)))
                                        (llm-request-callback-in-buffer buf partial-callback response)))
                        :on-success (lambda (data)
                                      (let ((response (llm-vertex--get-chat-response-streaming data)))
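
For reference, the two endpoints the new ~llm-gemini--chat-url~ builds look roughly like this; the model name and key below are placeholders, and the constructor keywords are assumed from the struct accessors used in the patch:

#+begin_src emacs-lisp
;; Illustrative only: what the non-streaming and streaming endpoints look like.
(require 'llm-gemini)
(let ((provider (make-llm-gemini :key "MY-KEY" :chat-model "gemini-pro")))
  (list (llm-gemini--chat-url provider nil)
        (llm-gemini--chat-url provider t)))
;; => ("https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent?key=MY-KEY"
;;     "https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:streamGenerateContent?key=MY-KEY")
#+end_src
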
diff --git a/llm-openai.el b/llm-openai.el
index 341275c9c8..fd57d0bd93 100644
--- a/llm-openai.el
+++ b/llm-openai.el
@@ -69,7 +69,7 @@ https://api.example.com/v1/chat, then URL should be
   "Return the request to the server for the embedding of STRING.
 MODEL is the embedding model to use, or nil to use the default.."
   `(("input" . ,string)
-    ("model" . ,(or model "text-embedding-ada-002"))))
+    ("model" . ,(or model "text-embedding-3-small"))))
 
 (defun llm-openai--embedding-extract-response (response)
   "Return the embedding from the server RESPONSE."
@@ -113,7 +113,7 @@ This is just the key, if it exists."
             "/") command))
 
 (cl-defmethod llm-embedding-async ((provider llm-openai) string vector-callback error-callback)
-  (llm-openai--check-key provider)
+  (llm-openai--check-key provider)  
   (let ((buf (current-buffer)))
     (llm-request-async (llm-openai--url provider "embeddings")
                        :headers (llm-openai--headers provider)
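
With the default embedding model changed to "text-embedding-3-small", an asynchronous embedding request looks roughly like the sketch below (the callbacks and ~my-openai-key~ are illustrative, not part of the patch):

#+begin_src emacs-lisp
;; Sketch only: request an embedding using the new default model.
(require 'llm-openai)
(llm-embedding-async
 (make-llm-openai :key my-openai-key)
 "The quick brown fox"
 (lambda (vector) (message "Embedding has %d dimensions" (length vector)))
 (lambda (type message) (message "Embedding failed (%s): %s" type message)))
#+end_src
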
diff --git a/llm-request.el b/llm-request.el
index a8ee5d489b..4241793c02 100644
--- a/llm-request.el
+++ b/llm-request.el
@@ -25,6 +25,18 @@
 (require 'url-http)
 (require 'rx)
 
+(defcustom llm-request-timeout 20
+  "The number of seconds to wait for a response from an HTTP server.
+
+Request timings depend on the request. Requests that need
+more output may take more time, and there is other processing
+besides just token generation that can take a while. Sometimes
+the LLM can get stuck, and you don't want it to take too long.
+This should be long enough for hard requests but short enough
+that stuck requests get cut off promptly."
+  :type 'integer
+  :group 'llm)
+
 (defun llm-request--content ()
   "From the current buffer, return the content of the response."
   (decode-coding-string
@@ -57,7 +69,7 @@ TIMEOUT is the number of seconds to wait for a response."
         (url-request-extra-headers
          (append headers '(("Content-Type" . "application/json"))))
         (url-request-data (encode-coding-string (json-encode data) 'utf-8)))
-    (let ((buf (url-retrieve-synchronously url t nil (or timeout 5))))
+    (let ((buf (url-retrieve-synchronously url t nil (or timeout llm-request-timeout))))
       (if buf
           (with-current-buffer buf
             (url-http-parse-response)
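
The new ~llm-request-timeout~ option (default 20 seconds) is what the synchronous request above falls back to; users with slow providers can raise it, for example:

#+begin_src emacs-lisp
;; Allow up to 60 seconds before a synchronous request is abandoned.
(setq llm-request-timeout 60)
;; Or interactively: M-x customize-variable RET llm-request-timeout RET
#+end_src
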
diff --git a/llm-vertex.el b/llm-vertex.el
index 87e4465cab..2427fde5ef 100644
--- a/llm-vertex.el
+++ b/llm-vertex.el
@@ -151,41 +151,28 @@ This handles different kinds of models."
   (pcase (type-of response)
     ('vector (mapconcat #'llm-vertex--get-chat-response-streaming
                         response ""))
-    ('cons (let ((parts (assoc-default 'parts
-                                       (assoc-default 'content 
-                                                      (aref (assoc-default 'candidates response) 0)))))
-             (if parts
-                 (assoc-default 'text (aref parts 0))
-               "")))))
-
-(defun llm-vertex--get-partial-chat-ui-repsonse (response)
-  "Return the partial response from as much of RESPONSE as we can parse.
-If the response is not parseable, return nil."
+    ('cons (if (assoc-default 'candidates response)
+               (let ((parts (assoc-default
+                             'parts
+                             (assoc-default 'content
+                                            (aref (assoc-default 'candidates response) 0)))))
+                 (if parts
+                     (assoc-default 'text (aref parts 0))
+                   ""))
+             "NOTE: No response was sent back by the LLM, the prompt may have violated safety checks."))))
+
+(defun llm-vertex--get-partial-chat-response (response)
+  "Return the partial response from as much of RESPONSE as we can parse."
   (with-temp-buffer
     (insert response)
-    (let ((start (point-min))
-          (end-of-valid-chunk
-           (save-excursion
-             (goto-char (point-max))
-             (search-backward "\n," nil t)
-             (point))))
-      (when (and start end-of-valid-chunk)
-        ;; It'd be nice if our little algorithm always worked, but doesn't, so let's
-        ;; just ignore when it fails.  As long as it mostly succeeds, it should be fine.
-        (condition-case nil
-            (when-let
-                ((json (ignore-errors
-                        (json-read-from-string
-                         (concat
-                          (buffer-substring-no-properties
-                           start end-of-valid-chunk)
-                          ;; Close off the json
-                          "]")))))
-              (llm-vertex--get-chat-response-streaming json))
-          (error (message "Unparseable buffer saved to *llm-vertex-unparseable*")
-                 (with-current-buffer (get-buffer-create "*llm-vertex-unparseable*")
-                     (erase-buffer)
-                     (insert response))))))))
+    (let ((result ""))
+      ;; We will just parse every line that is "text": "..." and concatenate them.
+      (save-excursion
+        (goto-char (point-min))
+        (while (re-search-forward (rx (seq (literal "\"text\": ")
+                                           (group-n 1 ?\" (* any) ?\") line-end)) nil t)
+          (setq result (concat result (json-read-from-string (match-string 1))))))
+      result)))
 
 (defun llm-vertex--chat-request-streaming (prompt)
   "Return an alist with chat input for the streaming API.
@@ -247,7 +234,7 @@ If STREAMING is non-nil, use the URL for the streaming API."
                     :headers `(("Authorization" . ,(format "Bearer %s" (llm-vertex-key provider))))
                      :data (llm-vertex--chat-request-streaming prompt)
                      :on-partial (lambda (partial)
-                                   (when-let ((response (llm-vertex--get-partial-chat-ui-repsonse partial)))
+                                   (when-let ((response (llm-vertex--get-partial-chat-response partial)))
                                      (llm-request-callback-in-buffer buf partial-callback response)))
                      :on-success (lambda (data)
                                    (let ((response (llm-vertex--get-chat-response-streaming data)))
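
To illustrate the rewritten partial-response parser: it simply concatenates every trailing ~"text": "..."~ line it finds, so a made-up fragment of a streamed response (the payload shape here is only an assumption for illustration) is handled like this:

#+begin_src emacs-lisp
;; Illustrative fragment only; real streaming responses contain much more.
(llm-vertex--get-partial-chat-response
 "          \"text\": \"Hello, \"\n          \"text\": \"world!\"")
;; => "Hello, world!"
#+end_src
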
diff --git a/llm.el b/llm.el
index ae59017bcd..7b190224b3 100644
--- a/llm.el
+++ b/llm.el
@@ -5,7 +5,7 @@
 ;; Author: Andrew Hyatt <ahyatt@gmail.com>
 ;; Homepage: https://github.com/ahyatt/llm
 ;; Package-Requires: ((emacs "28.1"))
-;; Package-Version: 0.8.0
+;; Package-Version: 0.9.1
 ;; SPDX-License-Identifier: GPL-3.0-or-later
 ;;
 ;; This program is free software; you can redistribute it and/or


