Short Answer:
No, not out of the box for much of its functionality. Too much of the code relies on File objects and on creating temporary File objects, which don't exist in GAE.
Long Answer:
It is open source, so you can hack at the code and change the methods that take File objects to take InputStream objects instead. Then you can process things that live in the Blobstore or GCS.
Here is an example I am hacking at now:
    @NotNull
    public static Metadata readMetadata(@NotNull File file) throws JpegProcessingException, IOException {
        JpegSegmentReader segmentReader = new JpegSegmentReader(file);
        return extractMetadataFromJpegSegmentReader(segmentReader.getSegmentData());
    }
Right next to it there is this perfectly good overload that isn't tied to the File object:

    @NotNull
    public static Metadata readMetadata(@NotNull InputStream inputStream, final boolean waitForBytes) throws JpegProcessingException {
        JpegSegmentReader segmentReader = new JpegSegmentReader(inputStream, waitForBytes);
        return extractMetadataFromJpegSegmentReader(segmentReader.getSegmentData());
    }
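The general shape of that change can be sketched with nothing but the JDK. In the example below, `JpegSniffer` and `looksLikeJpeg` are hypothetical names invented for illustration and are not part of Tika or any other library. The point is only the pattern: put the real logic in the InputStream overload and let the File overload be a thin wrapper around it.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration only; not a class from any real library.
class JpegSniffer {

    // Stream-based core: works with any source of bytes (Blobstore, GCS, memory).
    static boolean looksLikeJpeg(InputStream in) throws IOException {
        int first = in.read();   // each read() returns 0..255, or -1 at end of stream
        int second = in.read();
        // A JPEG stream starts with the SOI marker bytes 0xFF 0xD8.
        return first == 0xFF && second == 0xD8;
    }

    // File-based convenience overload: opens the file and delegates to the stream version.
    static boolean looksLikeJpeg(File file) throws IOException {
        try (InputStream in = new FileInputStream(file)) {
            return looksLikeJpeg(in);
        }
    }
}
```

Code that already has the bytes in memory can then call `JpegSniffer.looksLikeJpeg(new java.io.ByteArrayInputStream(bytes))` and never touch the File overload.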
Some parse() methods will create temp files if you pass in a File directly or if you created the TikaInputStream from a file. You can also trigger it by calling getFile() or getFileChannel() on the TikaInputStream. So you may be able to control it by creating the TikaInputStream yourself and avoiding File objects in the process (i.e., loading the file into memory first or streaming it somehow). However, if the parser implementation itself calls getFile() or getFileChannel(), then you're out of luck, short of implementing the parser yourself.
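A minimal sketch of the "load it into memory first" option, using only the JDK. `InMemoryStreams` and `bufferInMemory` are hypothetical names, and this assumes the content is small enough to fit in the instance's memory. The returned ByteArrayInputStream supports mark/reset, and it is the kind of stream you could then hand to TikaInputStream.get(InputStream). Whether a given parser still asks for a file after that depends on the parser, as noted above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration only.
class InMemoryStreams {

    // Copies the whole source stream into a byte array and returns a
    // re-readable, memory-backed stream; no File is created at any point.
    static InputStream bufferInMemory(InputStream source) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = source.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new ByteArrayInputStream(buffer.toByteArray());
    }
}
```

The trade-off is memory: this holds the entire document in the heap, so it only suits files well below the instance's memory limit.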
Read full article from java - Can I use Tika for content extraction on Google App Engine? - Stack Overflow