I am trying to import parent-child data into Elasticsearch.
Here is my Logstash config file:
input {
  jdbc {
    jdbc_driver_library => "/usr/library/postgresql-42.7.4.jar"
    jdbc_connection_string => "jdbc:postgresql://185.**.**.28:5432/feed"
    jdbc_user => "airflow"
    jdbc_password => "**"
    jdbc_driver_class => "org.postgresql.Driver"
    statement => "select distinct source from public.news"
  }
}
filter {
  jdbc_streaming {
    jdbc_driver_library => "/usr/library/postgresql-42.7.4.jar"
    jdbc_connection_string => "jdbc:postgresql://185.**.**.28:5432/feed"
    jdbc_user => "airflow"
    jdbc_password => "**"
    jdbc_driver_class => "org.postgresql.Driver"
    parameters => { "source" => "source" }
    statement => "select title, article, url, views from public.news where source = :source limit 10"
    target => "posts"
  }
  mutate {
    remove_field => ["@version", "@timestamp"]
  }
}
output {
  elasticsearch {
    hosts => ["https://172.21.0.2:9200"]
    ssl_certificate_authorities => "/usr/library/http_ca.crt"
    ssl_verification_mode => "full"
    user => "elastic"
    password => "*****"
    index => "content2"
    document_id => "%{source}"
  }
  stdout {
    codec => rubydebug
  }
}
Here is the index mapping that was auto-created:
{
  "mappings": {
    "properties": {
      "posts": {
        "properties": {
          "article": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          ...
        }
      }
    }
  }
}
How can I map the posts field as a nested type, with posts as a child property of source? Any help is greatly appreciated, thank you!
What you are seeing is an index created with Elasticsearch's default (dynamic) mapping.
You need to explicitly tell Elasticsearch what your index mapping is before ingesting the documents.
This can be done as follows: I am creating the index 79084663 below, with the posts field mapped as a nested field.
PUT 79084663/
{
  "mappings": {
    "properties": {
      "source": {
        "type": "text"
      },
      "posts": {
        "type": "nested",
        "properties": {
          "url": {
            "type": "keyword"
          },
          "title": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "article": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}
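Once this index exists, point the Logstash elasticsearch output at it (index => "79084663" instead of "content2") so the documents land in the pre-mapped index. To check that the nested mapping behaves as expected after ingestion, you can run a nested query; this is only a minimal sketch, and the search term is a placeholder:

GET 79084663/_search
{
  "query": {
    "nested": {
      "path": "posts",
      "query": {
        "match": {
          "posts.title": "your search term"
        }
      }
    }
  }
}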