How do I remove emojis from file names or usernames?


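My emoji-removal method has a problem.

The method is used when downloading chat transcripts: it checks whether a username contains emojis and removes them if present. It works locally and in the staging environment. In production, however, it sometimes works for a transcript with a given username but fails for other transcripts with the same username.

This is the code point reported in the error: U+1F413, ROOSTER.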

2024-09-26 16:17:20.877 ERROR [whatsapp-api-service,64b76bd04750daa2,64b76bd04750daa2] 3200902 --- [http-nio-8002-exec-2] z.c.e.p.s.UUIDAuthenticationFilter       : Request processing failed; nested exception is Error downloading transcript, U+1F413 ('.notdef') is not available in the font Helvetica, encoding: WinAnsiEncoding

org.springframework.web.util.NestedServletException: Request processing failed; nested exception is Error downloading transcript, U+1F413 ('.notdef') is not available in the font Helvetica, encoding: WinAnsiEncoding

After manually removing the emoji in the database, I was able to download the transcript for that particular user.

Here is the code.

    private static final List<UnicodeRange> EMOJI_RANGES = Arrays.asList(
            new UnicodeRange(0x1F600, 0x1F64F),  // Emoticons
            new UnicodeRange(0x1F300, 0x1F5FF),  // Misc Symbols and Pictographs
            new UnicodeRange(0x1F680, 0x1F6FF),  // Transport and Map Symbols
            new UnicodeRange(0x2600, 0x26FF),    // Misc Symbols
            new UnicodeRange(0x2700, 0x27BF),    // Dingbats
            new UnicodeRange(0x1F900, 0x1F9FF),  // Supplemental Symbols and Pictographs
            new UnicodeRange(0x1FA70, 0x1FAFF),  // Symbols and Pictographs Extended-A
            new UnicodeRange(0x1F1E6, 0x1F1FF),  // Regional Indicator Symbols
            new UnicodeRange(0xFE00, 0xFE0F),    // Variation Selectors
            new UnicodeRange(0x1F000, 0x1F02F),  // Mahjong Tiles
            new UnicodeRange(0x1F0A0, 0x1F0FF),  // Playing Cards
            new UnicodeRange(0x1F700, 0x1F77F),  // Alchemical Symbols
            new UnicodeRange(0x1F780, 0x1F7FF),  // Geometric Shapes Extended
            new UnicodeRange(0x1F800, 0x1F8FF)   // Supplemental Arrows and Symbols
    );

    private boolean isEmoji(int codePoint) {
        for (UnicodeRange range : EMOJI_RANGES) {
            if (range.contains(codePoint)) {
                return true;
            }
        }
        return false;
    }

    private String removeEmojis(String input) {
        StringBuilder result = new StringBuilder();
        int length = input.length();
        int i = 0;

        while (i < length) {
            int codePoint = input.codePointAt(i);

            if (Character.isSupplementaryCodePoint(codePoint)) {
                i += Character.charCount(codePoint);
                continue;
            }

            if (!isEmoji(codePoint)) {
                result.append(Character.toChars(codePoint));
            }

            i += Character.charCount(codePoint);
        }

        return result.toString();
    }

    private byte[] generatePdfTranscript() throws IOException {
        if (ticketNumber != null) {
            fileName = ticketNumber + "_transcript.pdf";
        } else {
            DateTimeFormatter formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd_HH-mm-ss");
            fileName = removeEmojis(customerName) + "_" + LocalDateTime.now().format(formatter) + "_transcript.pdf";
        }

        ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();

        try (PDDocument document = new PDDocument()) {
            initializeFonts(document);

            PDPage page = new PDPage(PDRectangle.A4);
            document.addPage(page);

            try (PDPageContentStream initialContentStream = new PDPageContentStream(document, page)) {
                float yPosition = page.getMediaBox().getHeight() - MARGIN_Y;
                yPosition = setupPdfHeader(initialContentStream, yPosition, document);
                yPosition = addTranscriptTitle(initialContentStream, yPosition);

                PDPageContentStream currentContentStream = initialContentStream;

                for (Message message : messages) {
                    Pair<Float, PDPageContentStream> result = addMessage(document, currentContentStream, message, yPosition - 15);
                    yPosition = result.getLeft();

                    if (result.getRight() != currentContentStream) {
                        if (currentContentStream != initialContentStream) {
                            currentContentStream.close();
                        }
                        currentContentStream = result.getRight();
                    }
                }

                if (currentContentStream != initialContentStream) {
                    currentContentStream.close();
                }
            }

            document.save(byteArrayOutputStream);
        }

        return byteArrayOutputStream.toByteArray();
    }

    private float setupPdfHeader(PDPageContentStream contentStream, float yPosition, PDDocument document) throws IOException {
        contentStream.saveGraphicsState();

        contentStream.setFont(PDType1Font.HELVETICA_BOLD, FONT_SIZE);

        float pageWidth = document.getPage(0).getMediaBox().getWidth();

        float headerStartY = yPosition + 10;
        float contentWidth = pageWidth - 2 * MARGIN_X;

        if (ticketNumber != null) {
            yPosition = addTextLine(contentStream, "Ticket Number: " + ticketNumber, yPosition);
            yPosition = addTextLine(contentStream, "Date & Time Ticket Issued: " + issuedTime, yPosition);
            yPosition = addTextLine(contentStream, "Date & Time Ticket Closed: " + closedTime, yPosition);
        } else {
            yPosition = addTextLine(contentStream, "No ticket issued", yPosition);
        }

        if (clientWhatsAppNumber != null) {
            yPosition = addTextLine(contentStream, "Service's WhatsApp number: " + clientWhatsAppNumber, yPosition);
        }

        if (customerWhatsAppNumber != null) {
            yPosition = addTextLine(contentStream, "Customer's WhatsApp number: " + customerWhatsAppNumber, yPosition);
        }

        yPosition = addTextLine(contentStream, "Agent's Name & Surname: " + removeEmojis(agentName), yPosition);
        yPosition = addTextLine(contentStream, "Customer's Name & Surname: " + removeEmojis(customerName), yPosition);

        float headerHeight = headerStartY - yPosition + FONT_SIZE; // Add FONT_SIZE to give some bottom padding

        contentStream.setLineWidth(1f);
        contentStream.setStrokingColor(Color.BLACK);
        contentStream.addRect(MARGIN_X - 5, yPosition, contentWidth + 10, headerHeight);
        contentStream.stroke();

        contentStream.restoreGraphicsState();

        yPosition -= FONT_SIZE;

        return yPosition;
    }

    private float addTranscriptTitle(PDPageContentStream contentStream, float yPosition) throws IOException {
        contentStream.setFont(PDType1Font.HELVETICA_BOLD, FONT_SIZE + 2);
        return addTextLine(contentStream, "Ticket Transcript:", yPosition - 20);
    }

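I wrote unit tests for the usernames that caused the error, and all of them passed. I deployed the code to the staging server, where it worked as expected and successfully downloaded transcripts for usernames containing emojis. In production, however, it works for some tickets with a given username but fails for other tickets with the same username. After deleting the emoji directly in the database, I was able to download the transcripts for the tickets that had failed.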

java spring-boot pdf web-applications emoji
1 Answer

Character.isEmoji

As g00se commented, you can greatly simplify your emoji-removal code.

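Java 21+ offers several methods on the Character class for identifying emoji characters: isEmoji, isEmojiPresentation, isEmojiModifier, isEmojiModifierBase, and isEmojiComponent. I have not studied these; I suggest you do. But as an example, I would guess the following code might do the job.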

String strippedOfEmoji =
        input
                .codePoints( )
                .filter( codepoint -> !Character.isEmoji( codepoint ) )
                .filter( codepoint -> !Character.isEmojiPresentation( codepoint ) )
                .filter( codepoint -> !Character.isEmojiModifier( codepoint ) )
                .filter( codepoint -> !Character.isEmojiComponent( codepoint ) )
                .collect( StringBuilder :: new , StringBuilder :: appendCodePoint , StringBuilder :: append )
                .toString( );

Let's try that code.

String input = "abc🐓xyz";

strippedOfEmoji = abcxyz

The Character class offers many such category-detection methods, so study it before writing your own.
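One caveat with the Character-based filter: Unicode gives the Emoji property to the ASCII digits, `#` and `*` as well, so `Character.isEmoji` and `Character.isEmojiComponent` return true for them and the filter removes them from names. The following is a minimal sketch, not a definitive implementation, comparing the filter above with a variant that never drops ASCII. It requires Java 21+, and the class and method names (`EmojiStripDemo`, `stripAllFlagged`, `stripKeepingAscii`) are made up for illustration.

```java
import java.util.function.IntPredicate;

class EmojiStripDemo {

    // Same rule as the filter chain above: drop every code point that any
    // of the Java 21 emoji predicates flags. This also drops 0-9, '#', '*'.
    static String stripAllFlagged(String input) {
        return filter(input, cp -> Character.isEmoji(cp)
                || Character.isEmojiPresentation(cp)
                || Character.isEmojiModifier(cp)
                || Character.isEmojiComponent(cp));
    }

    // Hypothetical variant (an assumption, not part of the answer):
    // ASCII is always kept, so digits and '#' survive.
    static String stripKeepingAscii(String input) {
        return filter(input, cp -> cp > 0x7F
                && (Character.isEmoji(cp) || Character.isEmojiComponent(cp)));
    }

    // Rebuilds the string from the code points that are not dropped.
    private static String filter(String input, IntPredicate drop) {
        return input.codePoints()
                .filter(drop.negate())
                .collect(StringBuilder::new,
                        StringBuilder::appendCodePoint,
                        StringBuilder::append)
                .toString();
    }

    public static void main(String[] args) {
        String input = "Agent 47 \uD83D\uDC13"; // ends with U+1F413 ROOSTER
        System.out.println("[" + stripAllFlagged(input) + "]");
        System.out.println("[" + stripKeepingAscii(input) + "]");
    }
}
```

The variant still removes non-ASCII emoji components such as the zero-width joiner (U+200D) and the variation selector U+FE0F, which would otherwise be left behind as invisible debris.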

UTF-8

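The real solution is to stop relying on WinAnsiEncoding and use a font that supports Unicode.

Judging by the error message, WinAnsiEncoding is the encoding of the Helvetica font used to write the PDF, not a setting of a terminal or of the application, and it covers only a small Latin character set. With a Unicode-capable font embedded in the document (in PDFBox, for example, a TrueType font loaded via PDType0Font.load), you should be able to render any of the roughly 150,000 characters defined in Unicode for which that font has a glyph.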
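If the Helvetica font has to stay, another option is to drop every character the font encoding cannot represent, rather than only emoji. The following is a minimal sketch under one loud assumption: it uses the JDK's windows-1252 charset as a stand-in for WinAnsiEncoding, which is similar but not identical, so treat the result as an approximation. The class and method names (`WinAnsiSanitizer`, `keepEncodable`) are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

class WinAnsiSanitizer {

    // Assumption: windows-1252 approximates the PDF WinAnsiEncoding.
    private static final Charset CP1252 = Charset.forName("windows-1252");

    // Keeps only printable characters that windows-1252 can encode.
    static String keepEncodable(String input) {
        CharsetEncoder encoder = CP1252.newEncoder();
        StringBuilder out = new StringBuilder(input.length());
        input.codePoints().forEach(cp -> {
            // Supplementary code points (such as U+1F413) never fit in a
            // single-byte charset; control characters are dropped as well.
            if (Character.isBmpCodePoint(cp)
                    && !Character.isISOControl(cp)
                    && encoder.canEncode((char) cp)) {
                out.append((char) cp);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // Rooster emoji removed, accented letter and digits kept.
        System.out.println(keepEncodable("Jos\u00E9 42 \uD83D\uDC13"));
    }
}
```

Applying such a filter at the single place where text is written to the page, instead of per field, would cover names and message bodies alike, at the cost of silently losing any character outside that Latin set.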
