
Read A PDF File And Store It In Datastax Astra DB






Create Datastax Astra DB





Activate The Environment




Download and install Python 3.11: https://www.python.org/downloads/release/python-3110/




py -3.11 -m venv rag_env
rag_env\Scripts\activate
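Once the environment is activated, a quick check from Python can confirm that the interpreter in use is the 3.11 one inside rag_env. The following is a minimal sketch; the helper names `in_virtualenv` and `version_matches` are hypothetical and are not part of the original article.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def version_matches(version_info, major: int = 3, minor: int = 11) -> bool:
    """Return True when the (major, minor) pair matches the expected version."""
    return tuple(version_info[:2]) == (major, minor)


if __name__ == "__main__":
    print("Inside a virtual environment:", in_virtualenv())
    print("Python 3.11:", version_matches(sys.version_info))
```

If either line prints False, the activate script has probably not been run in the current terminal, or a different Python launcher was used to create the environment.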



Install The Required Libraries




pip install -U langchain langchain-community langchain-openai langchain-text-splitters cassio cassandra-driver openai pypdf
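After the install finishes, it can help to confirm that every package from the pip command is actually present in the active environment. The sketch below uses only the standard library; the `REQUIRED` list and the `installed_versions` helper are hypothetical names introduced for illustration.

```python
from importlib import metadata

# Distribution names taken from the pip command above.
REQUIRED = [
    "langchain",
    "langchain-community",
    "langchain-openai",
    "langchain-text-splitters",
    "cassio",
    "cassandra-driver",
    "openai",
    "pypdf",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as NOT INSTALLED most likely means pip ran outside the activated environment.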



Code


from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores.cassandra import Cassandra
import cassio

# =========================================
# CONFIG
# =========================================
OPENAI_API_KEY = "your_open_api_key"
ASTRA_DB_APPLICATION_TOKEN = "db_token"
ASTRA_DB_ID = "db_id"
ASTRA_DB_REGION = "us-east-2"
KEYSPACE = "default_keyspace"
PDF_FILE = "Access Resources In Android Studio.pdf"

# =========================================
# INIT ASTRA CONNECTION
# =========================================
cassio.init(
    token=ASTRA_DB_APPLICATION_TOKEN,
    database_id=ASTRA_DB_ID,
    keyspace=KEYSPACE
)

# Load the PDF; each page becomes one document
loader = PyPDFLoader(PDF_FILE)
docs = loader.load()

# Split the pages into overlapping chunks
splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=50
)
chunks = splitter.split_documents(docs)

# Embed the chunks and store them in the Astra DB table
embeddings = OpenAIEmbeddings(
    api_key=OPENAI_API_KEY
)
vectorstore = Cassandra.from_documents(
    documents=chunks,
    embedding=embeddings,
    keyspace=KEYSPACE,
    table_name="pdf_vectors"
)

print("PDF stored successfully in Astra DB!")
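The splitter settings above mean that each chunk holds at most 500 characters and that consecutive chunks can share up to 50 characters, so text cut at a chunk boundary still appears with some context in the next chunk. The sketch below illustrates that overlap with a plain fixed-window splitter. It is a simplified stand-in, not the algorithm used by RecursiveCharacterTextSplitter, which prefers to break on separators such as paragraphs and lines; `sliding_chunks` is a hypothetical helper name.

```python
def sliding_chunks(text: str, chunk_size: int = 500, chunk_overlap: int = 50):
    """Split text into fixed-size windows that overlap by chunk_overlap characters.

    Illustration only: the real splitter tries to break on separators
    (paragraphs, lines, spaces) instead of cutting at fixed offsets.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


if __name__ == "__main__":
    sample = "".join(chr(97 + i % 26) for i in range(1200))
    pieces = sliding_chunks(sample)
    print([len(piece) for piece in pieces])  # [500, 500, 300]
    print(pieces[0][-50:] == pieces[1][:50])  # True
```

A larger overlap keeps more shared context between neighbouring chunks but also stores and embeds more duplicated text.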



Output


[Picture showing the script output]





Posted By  -  Karan Gupta
 
Posted On  -  Saturday, May 23, 2026
