Wednesday, September 30, 2015

Receive a word, read a text file, and count how many times the word appears in the file

This is sample code that receives a word, reads a text file, and counts how many times the word appears in the file. If you are using a Mac, this program might not behave as expected, because it uses .useDelimiter("\\r\\n") and text files on a Mac usually end each line with \n only.

import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.InputStreamReader;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ScanTwo{
  public static void main(String args[]){
    try{
      // Ask for the path of the text file
      System.out.println("Input a path of the file:");
      BufferedReader input = new BufferedReader (new InputStreamReader (System.in));
      String str1 = input.readLine( );
      System.out.println("Your file's local path is: " + str1);

      // Open the file and split it on Windows line endings
      File file = new File(str1);
      Scanner scan = new Scanner(file);
      scan.useDelimiter("\\r\\n");

      // Ask for the word to search for
      System.out.println("Input a word:");
      BufferedReader input2 = new BufferedReader (new InputStreamReader (System.in));
      String str3 = input2.readLine( );
      System.out.println("Your word is: " + str3);

      // Count every match of the word in every token
      int count = 0;
      while(scan.hasNext()) {
        String str2 = scan.next();
        Pattern pattern1 = Pattern.compile(str3);
        Matcher m = pattern1.matcher(str2);
        while (m.find()) {
          count++;
        }
      }
      scan.close();
      System.out.println(count);
    }catch(FileNotFoundException e){
      System.out.println(e);
    }catch(Exception ex){
      ex.printStackTrace();
    }
  }
}

Let me explain the details. This program reads a line typed at the console with this code:

  BufferedReader input = new BufferedReader (new InputStreamReader (System.in));
  String str1 = input.readLine( );

This creates the "input" object, and the line that was typed is assigned to the variable str1. That's why

  System.out.println("Your file's local path is: " + str1);

shows the content of str1, which is whatever was typed in.

  File file = new File(str1);
  Scanner scan = new Scanner(file);
  scan.useDelimiter("\\r\\n");

"File" is Java's class for representing a file by its path. I made an object "file" from the File class, and this object is used to open the file. "scan" is an object of the Scanner class. The call scan.useDelimiter("\\r\\n") sets the delimiter that separates the data into tokens. Here the delimiter is the Windows line ending, so each token is one line of the file.
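
To see what the delimiter does, here is a small sketch that runs a Scanner over a String instead of a file. The class name DelimiterDemo, the helper tokens, and the alternative delimiter "\\r?\\n" are my own additions for illustration and are not part of the program above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class DelimiterDemo {
    // Splits the text into tokens with a Scanner, using the given delimiter regex.
    static List<String> tokens(String text, String delimiterRegex) {
        List<String> result = new ArrayList<>();
        try (Scanner scan = new Scanner(text)) {
            scan.useDelimiter(delimiterRegex);
            while (scan.hasNext()) {
                result.add(scan.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String windowsText = "one\r\ntwo\r\nthree"; // lines end with \r\n
        String unixText = "one\ntwo\nthree";        // lines end with \n only

        System.out.println(tokens(windowsText, "\\r\\n").size()); // prints 3
        System.out.println(tokens(unixText, "\\r\\n").size());    // prints 1 (delimiter never matches)
        System.out.println(tokens(unixText, "\\r?\\n").size());   // prints 3 (\r is optional)
    }
}
```

When the delimiter never matches, the whole text comes back as a single token. A delimiter such as "\\r?\\n" accepts both kinds of line ending.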

  BufferedReader input2 = new BufferedReader (new InputStreamReader (System.in));
  String str3 = input2.readLine( );

This reads from the console again, this time to get the word to search for.
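
One thing to be careful about: a BufferedReader may read ahead more than one line, so when the input is piped or redirected, a second BufferedReader on System.in can miss the second line. The sketch below reads both lines from a single reader. The class name PromptTwice and the method readPathAndWord are hypothetical names I chose for this example, and a StringReader stands in for the console so that it runs without typing anything.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class PromptTwice {
    // Reads the file path and then the word from the same BufferedReader.
    static String[] readPathAndWord(BufferedReader in) throws IOException {
        System.out.println("Input a path of the file:");
        String path = in.readLine();
        System.out.println("Input a word:");
        String word = in.readLine();
        return new String[] { path, word };
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the console; with real input, wrap
        // new InputStreamReader(System.in) instead of the StringReader.
        BufferedReader in = new BufferedReader(new StringReader("notes.txt\napple\n"));
        String[] answers = readPathAndWord(in);
        System.out.println("Your file's local path is: " + answers[0]);
        System.out.println("Your word is: " + answers[1]);
    }
}
```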




And then:
  while(scan.hasNext()) {
    String str2 = scan.next();
    Pattern pattern1 = Pattern.compile(str3);
    Matcher m = pattern1.matcher(str2);
    while (m.find()) {
      count++;
    }
  }
This is the most difficult part. The loop repeats the same action as long as there is something left to scan in the file (see the condition of the while statement). str3 is the word that was typed at the console.

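In this program, str2 stores the result of scan.next(). scan is an object of the Scanner class, and next() returns the data between one delimiter and the next. Each time the pattern built from str3 is found inside str2, m.find() returns true and count is increased by 1. Because m.find() is called in a loop, every occurrence inside str2 is counted, not just the first one. Note that Pattern.compile(str3) treats str3 as a regular expression, so characters such as "." have a special meaning.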
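
The counting step can be tried on its own with a short sketch like the one below. The class name CountMatches and the method countMatches are names I made up for this example. Unlike the program above, it wraps the word in Pattern.quote so that the word is matched literally; that is my own change, not something the original code does.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CountMatches {
    // Counts non-overlapping occurrences of word inside text.
    static int countMatches(String text, String word) {
        if (word.isEmpty()) {
            return 0; // an empty pattern would match at every position
        }
        // Pattern.quote makes characters such as "." match themselves only.
        Pattern pattern = Pattern.compile(Pattern.quote(word));
        Matcher m = pattern.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countMatches("banana bandana", "an")); // prints 4
        System.out.println(countMatches("a.b axb", "a.b"));       // prints 1
    }
}
```

Without Pattern.quote, the second call would count "axb" as well, because "." in a regular expression matches any character.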

That's why, when the variable "count" is displayed, we can see how many times the word appeared in the text file.
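
For readers on a Mac or Linux, here is a rough sketch of a variation that does not depend on the line ending. It reads the file with BufferedReader.readLine(), which accepts \n, \r, and \r\n, instead of using a Scanner delimiter. The class name CountInFile, the method countInFile, and the temporary sample file are all assumptions made for this example, and it assumes the file is encoded in UTF-8.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CountInFile {
    // Counts how many times word appears in the file, line by line.
    static int countInFile(Path file, String word) throws IOException {
        // The word is matched literally (Pattern.quote), unlike the original program.
        Pattern pattern = Pattern.compile(Pattern.quote(word));
        int count = 0;
        // try-with-resources closes the reader even if an exception is thrown.
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                Matcher m = pattern.matcher(line);
                while (m.find()) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Sample file that mixes \r\n and \n line endings on purpose.
        Path sample = Files.createTempFile("sample", ".txt");
        Files.write(sample, "cat dog\r\ncat\ncatcat".getBytes(StandardCharsets.UTF_8));
        System.out.println(countInFile(sample, "cat")); // prints 4
        Files.delete(sample);
    }
}
```

As in the original program, a word that is split across two lines is not counted.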