Skip to content

This JavaFX GUI program was created to allow a user to enter a website, start and finish points to determine the frequency of each word within the selected region of text.

Notifications You must be signed in to change notification settings

derekdotdev/JavaFX-Word-Frequency-Counter

Repository files navigation

JavaFX Word Frequency Counter

By default, this JavaFX program was created to scrape [The Raven by Edgar Allen Poe eBook on Project Gutenberg's website] (https://www.gutenberg.org/files/1065/1065-h/1065-h.htm),

and track each word and its occurrences before returning the top 10 occurrences to a GUI. It now has the ability to accept any URL and a start + end point to calculate the number of occurences of each word on a given web page.

[Here's a video demonstration] (https://vimeo.com/636527956)

This is done in the following steps:
  • Accept the URL, start and end points from the user,

  • Process the user input:

    • Add each line of the page to a string buffer and write buffer to a single String object
    • Use replaceAll() and RegEx to strip away html tags and toLowerCase() to avoid duplicate entries ('The' and 'the').
    • Split string into individual words using split() with RegEx to ignore all characters except alphabet and apostrophe (allow contractions) and add resultant words to a String array.
    • Iterate over String array and add unique entries to local MySQL database (formerly HashMap<String, Integer> with key=word, value=occurrences).
    • Finally, iterate over data and display Top 10 and All results to the JavaFX GUI

This project was created for CEN3024C - Software Development I - Module 6 and is saved locally as cen3024-module-006-word-frequency-count-gui

About

This JavaFX GUI program was created to allow a user to enter a website, start and finish points to determine the frequency of each word within the selected region of text.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published