01 Test Pipeline
[1]
INFO - haystack.modeling.model.optimization - apex not found, won't use it. See https://nvidia.github.io/apex/
ERROR - root - Failed to import 'magic' (from 'python-magic' and 'python-magic-bin' on Windows). FileTypeClassifier will not perform mimetype detection on extensionless files. Please make sure the necessary OS libraries are installed if you need this functionality.
INFO - haystack.document_stores.pinecone - Index statistics: name: haystack-nhs-jul, embedding dimensions: 768, record count: 0
[2]
(8521, 8521)
[3]
INFO - haystack.modeling.utils - Using devices: CPU
INFO - haystack.modeling.utils - Number of GPUs: 0
INFO - haystack.retriever.dense - Init retriever using embeddings of model sentence-transformers/multi-qa-mpnet-base-dot-v1
[4]
INFO - haystack.modeling.utils - Using devices: CPU
INFO - haystack.modeling.utils - Number of GPUs: 0
INFO - haystack.modeling.model.language_model - LOADING MODEL
INFO - haystack.modeling.model.language_model - =============
INFO - haystack.modeling.model.language_model - Could not find deepset/roberta-base-squad2-distilled locally.
INFO - haystack.modeling.model.language_model - Looking on Transformers Model Hub (in local cache and online)...
INFO - haystack.modeling.model.language_model - Loaded deepset/roberta-base-squad2-distilled
INFO - haystack.modeling.utils - Using devices: CPU
INFO - haystack.modeling.utils - Number of GPUs: 0
INFO - haystack.modeling.infer - Got ya 9 parallel workers to do inference ...
INFO - haystack.modeling.infer - 0 0 0 0 0 0 0 0 0
INFO - haystack.modeling.infer - /w\ /w\ /w\ /w\ /w\ /w\ /w\ /|\ /w\
INFO - haystack.modeling.infer - /'\ / \ /'\ /'\ / \ / \ /'\ /'\ /'\
Now we can begin asking questions:
[5]
Batches: 0%| | 0/1 [00:00<?, ?it/s]
INFO - haystack.modeling.model.optimization - apex not found, won't use it. See https://nvidia.github.io/apex/
ERROR - root - Failed to import 'magic' (from 'python-magic' and 'python-magic-bin' on Windows). FileTypeClassifier will not perform mimetype detection on extensionless files. Please make sure the necessary OS libraries are installed if you need this functionality.
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
Inferencing Samples: 0%| | 0/1 [00:00<?, ? Batches/s]
/Users/jamesbriggs/opt/anaconda3/envs/ml/lib/python3.9/site-packages/haystack/modeling/model/prediction_head.py:483: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  start_indices = flat_sorted_indices // max_seq_len
Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00, 9.27 Batches/s]
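The `huggingface/tokenizers` warning in the output above names an environment variable that silences it. A minimal sketch, assuming the variable is set at the top of the notebook before the tokenizer is first used (the choice of `"false"` is an assumption; the warning accepts either value):

```python
import os

# Set before `tokenizers` / `transformers` is first used in this process,
# so forked worker processes inherit the setting.
os.environ["TOKENIZERS_PARALLELISM"] = "false"

print(os.environ["TOKENIZERS_PARALLELISM"])  # false
```

Setting it to `"false"` only disables the tokenizer's internal parallelism; it does not affect the inference worker processes reported in the logs.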
[6]
Query: Who is affected by pre-eclampsia?
Answers:
[ { 'answer': 'pregnant women',
'context': 'atment Complications Pre-eclampsia is a condition that '
'affects some pregnant women, usually during the second '
'half of pregnancy (from 20 weeks) or soo'},
{ 'answer': 'mother and baby',
'context': ' are mild, the condition can lead to serious complications '
"for both mother and baby if it's not monitored and "
'treated. The earlier pre-eclampsia is d'}]
We can see that the top answer appears to be correct. To extract each component individually, rather than printing with the built-in method, we can do this:
[7]
'atment Complications Pre-eclampsia is a condition that affects some pregnant women, usually during the second half of pregnancy (from 20 weeks) or soo'
[8]
(68, 82)
[9]
'pregnant women'
[10]
<Answer {'answer': 'pregnant women', 'type': 'extractive', 'score': 0.8104832470417023, 'context': 'atment Complications Pre-eclampsia is a condition that affects some pregnant women, usually during the second half of pregnancy (from 20 weeks) or soo', 'offsets_in_document': [{'start': 140, 'end': 154}], 'offsets_in_context': [{'start': 68, 'end': 82}], 'document_id': '3bc401b213c2720c83ee9bddb0e769b8', 'meta': {'url': 'www.nhs.uk/conditions/pre-eclampsia'}}>
[11]
0.8104832470417023
[12]
'www.nhs.uk/conditions/pre-eclampsia'
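The original code cells are not shown here, so the following is only a sketch of how the component values above relate to each other. It does not use Haystack itself: `Span` and `AnswerStub` are hypothetical stand-ins that mirror the fields printed in the `<Answer ...>` output, filled with the values from that output. It shows that slicing the context with the context offsets recovers the answer text.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    # Hypothetical stand-in for one entry of 'offsets_in_context'
    start: int
    end: int


@dataclass
class AnswerStub:
    # Hypothetical stand-in mirroring the fields printed in the output above
    answer: str
    score: float
    context: str
    offsets_in_context: list
    meta: dict = field(default_factory=dict)


# Values copied from the printed Answer object
top = AnswerStub(
    answer="pregnant women",
    score=0.8104832470417023,
    context=(
        "atment Complications Pre-eclampsia is a condition that affects some "
        "pregnant women, usually during the second half of pregnancy "
        "(from 20 weeks) or soo"
    ),
    offsets_in_context=[Span(start=68, end=82)],
    meta={"url": "www.nhs.uk/conditions/pre-eclampsia"},
)

# The offsets are character positions within the context window
span = top.offsets_in_context[0]
extracted = top.context[span.start:span.end]

print(extracted)        # pregnant women
print(top.score)        # 0.8104832470417023
print(top.meta["url"])  # www.nhs.uk/conditions/pre-eclampsia
```

With a real pipeline result, the same attribute names would presumably be read from the first item of the returned answers list; that access pattern is an assumption, since the cell source is not included.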