Category: Uncategorized

  • Collect, Enrich, & Store News Data

    2. Developed, together with the company, Python scripts that retrieve feed items daily and enrich them with additional metadata.
    3. Learned and used the Java-based Stanford Natural Language Processing (NLP) tools and server to extract interesting entities from each item's title and short summary. Stanford CoreNLP (which includes a “part of speech” tagger) splits sentences into tokens and performs Named Entity…
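
    The daily retrieve-and-enrich step could look roughly like the sketch below. It is not the project's actual script: the sample feed, the `parse_items` and `enrich` helpers, and the metadata fields (`fetched_at`, `item_id`) are assumptions made for illustration, and only the Python standard library is used.

```python
import hashlib
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Hypothetical RSS snippet standing in for a real feed response.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <item>
    <title>SpaceX rocket explosion</title>
    <description>A rocket exploded during a test.</description>
    <link>https://example.com/a</link>
  </item>
</channel></rss>"""


def parse_items(feed_xml):
    """Return the title, summary, and link of every <item> in an RSS document."""
    root = ET.fromstring(feed_xml)
    return [
        {
            "title": item.findtext("title", default=""),
            "summary": item.findtext("description", default=""),
            "link": item.findtext("link", default=""),
        }
        for item in root.iter("item")
    ]


def enrich(item, fetched_at=None):
    """Add assumed metadata fields: a fetch timestamp and a stable item id."""
    fetched_at = fetched_at or datetime.now(timezone.utc)
    enriched = dict(item)
    enriched["fetched_at"] = fetched_at.isoformat()
    # Hash the link so the same article gets the same id on every run.
    enriched["item_id"] = hashlib.sha256(item["link"].encode("utf-8")).hexdigest()
    return enriched


items = [enrich(i) for i in parse_items(SAMPLE_FEED)]
```

    Deriving the id from the link is one way to keep repeated daily runs from storing the same article twice; any stable key would serve the same purpose.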

  • Train IPTC Category Classifier

    1. IPTC Media Topic Codes – classification codes used to label each news article with the category it falls under.
    2. Develop Queries
    3. Word Embeddings – real-valued vectors that encode the meaning of words in such a way that words that are closer in the vector space are expected…
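
    To make "closer in the vector space" concrete, here is a minimal sketch that compares toy embeddings with cosine similarity. The three-dimensional vectors are made up for illustration, and cosine similarity is an assumed choice of closeness measure; the project's actual embeddings and distance measure are not stated here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up 3-dimensional embeddings; real embeddings have far more dimensions.
embeddings = {
    "rocket": [0.9, 0.1, 0.0],
    "spacex": [0.8, 0.2, 0.1],
    "election": [0.0, 0.2, 0.9],
}

sim_related = cosine_similarity(embeddings["rocket"], embeddings["spacex"])
sim_unrelated = cosine_similarity(embeddings["rocket"], embeddings["election"])
```

    With these toy values, `sim_related` comes out much higher than `sim_unrelated`, which is the property a classifier can exploit when it takes embeddings as input.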

  • Test & Visualize IPTC Category Classification Results

    Training was monitored in TensorBoard, the visualization tool that accompanies TensorFlow, and the results were verified with a confusion matrix to show how accurate the classifications were.

    Results from Neutrality Classifier:

    Input: {"article_id": "cb3b1c0e-f002-4c8d-8119-b78b0f7578e4","fingerprint": [{"term": "explosion"},{"term": "rocket"},{"term": "spacex"}]}

    Output: {"fingerprint": [{"term": "explosion","value": "01100101011110000111000001101100011011110111001101101001011011110110111000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"},{"term": "spacex","value": "01110011011100000110000101100011011001010111100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"}],"articleprint": "000000000000010000001300000001110011011000110110100101100101011011100110001101100101001000000110000101101110011001000010000001110100011001010110001101101000011011100110111101101100011011110110011101111001000000000000000000000000","articleprint_confidence": {"0": 0.00020429682626854628,"1": 0.0036759688518941402,"2": 0.00007796963473083451,"3": 0.00016331530059687793,"4":…
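
    The confusion-matrix check mentioned above can be sketched in plain Python as follows. The category names and the true/predicted labels are hypothetical and are not results from the project; the helper names `confusion_matrix` and `accuracy` are likewise illustrative.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count predictions per class; rows are true labels, columns are predicted."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


def accuracy(matrix):
    """Share of predictions that fall on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Hypothetical labels and predictions, not results from the project.
labels = ["politics", "sport", "science"]
y_true = ["politics", "sport", "science", "science", "sport", "politics"]
y_pred = ["politics", "sport", "science", "sport", "sport", "science"]

cm = confusion_matrix(y_true, y_pred, labels)
acc = accuracy(cm)
```

    Off-diagonal cells show which categories get mistaken for one another, which a single accuracy figure cannot reveal.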
